Databricks Certified Data Analyst Associate Exam Questions and Answers

Question 1

Which of the following approaches can be used to ingest data directly from cloud-based object storage?

Options:

A.

Create an external table while specifying the DBFS storage path to FROM

B.

Create an external table while specifying the DBFS storage path to PATH

C.

It is not possible to directly ingest data from cloud-based object storage

D.

Create an external table while specifying the object storage path to FROM

E.

Create an external table while specifying the object storage path to LOCATION

Question 2

The stakeholders.customers table has 15 columns and 3,000 rows of data. The following command is run:

After running SELECT * FROM stakeholders.eur_customers, 15 rows are returned. After the command executes completely, the user logs out of Databricks.

After logging back in two days later, what is the status of the stakeholders.eur_customers view?

Options:

A.

The view remains available and SELECT * FROM stakeholders.eur_customers will execute correctly.

B.

The view has been dropped.

C.

The view is not available in the metastore, but the underlying data can be accessed with SELECT * FROM delta.`stakeholders.eur_customers`.

D.

The view remains available, but attempting to SELECT from it results in an empty result set because data in views is automatically deleted after logging out.

E.

The view has been converted into a table.

Question 3

A data analyst has been asked to produce a visualization that shows the flow of users through a website.

Which of the following is used for visualizing this type of flow?

Options:

A.

Heatmap

B.

Choropleth

C.

Word Cloud

D.

Pivot Table

E.

Sankey

Question 4

Which of the following statements describes descriptive statistics?

Options:

A.

A branch of statistics that uses summary statistics to quantitatively describe and summarize data.

B.

A branch of statistics that uses a variety of data analysis techniques to infer properties of an underlying distribution of probability.

C.

A branch of statistics that uses quantitative variables that must take on a finite or countably infinite set of values.

D.

A branch of statistics that uses summary statistics to categorically describe and summarize data.

E.

A branch of statistics that uses quantitative variables that must take on an uncountable set of values.

Question 5

A data analyst needs to use the Databricks Lakehouse Platform to quickly create SQL queries and data visualizations. It is a requirement that the compute resources in the platform can be made serverless, and it is expected that data visualizations can be placed within a dashboard.

Which of the following Databricks Lakehouse Platform services/capabilities meets all of these requirements?

Options:

A.

Delta Lake

B.

Databricks Notebooks

C.

Tableau

D.

Databricks Machine Learning

E.

Databricks SQL

Question 6

In which of the following situations will the mean value and median value of a variable be meaningfully different?

Options:

A.

When the variable contains no outliers

B.

When the variable contains no missing values

C.

When the variable is of the boolean type

D.

When the variable is of the categorical type

E.

When the variable contains a lot of extreme outliers
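
As an aside on Question 6, the relationship between mean and median can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the sample values are hypothetical and invented purely for illustration, not taken from the exam material.

```python
import statistics

# Hypothetical sample values, invented for illustration only.
typical = [10, 11, 12, 12, 13]

# The same values plus two extreme observations (also hypothetical).
with_outliers = typical + [1000, 2000]

mean_typical = statistics.mean(typical)
median_typical = statistics.median(typical)
mean_outliers = statistics.mean(with_outliers)
median_outliers = statistics.median(with_outliers)

print("typical:       mean =", mean_typical, " median =", median_typical)
print("with outliers: mean =", mean_outliers, " median =", median_outliers)
```

In this sketch the median is the same for both lists, while the mean moves substantially once the extreme values are included.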

Question 7

A data analyst has created a Query in Databricks SQL, and now they want to create two data visualizations from that Query and add both of those data visualizations to the same Databricks SQL Dashboard.

Which of the following steps will they need to take when creating and adding both data visualizations to the Databricks SQL Dashboard?

Options:

A.

They will need to alter the Query to return two separate sets of results.

B.

They will need to add two separate visualizations to the dashboard based on the same Query.

C.

They will need to create two separate dashboards.

D.

They will need to decide on a single data visualization to add to the dashboard.

E.

They will need to copy the Query and create one data visualization per query.

Question 8

Delta Lake stores table data as a series of data files, but it also stores a lot of other information.

Which of the following is stored alongside data files when using Delta Lake?

Options:

A.

None of these

B.

Table metadata, data summary visualizations, and owner account information

C.

Table metadata

D.

Data summary visualizations

E.

Owner account information

Question 9

A data analyst has a managed table table_name in database database_name. They would now like to remove the table from the database and all of the data files associated with the table. The rest of the tables in the database must continue to exist.

Which of the following commands can the analyst use to complete the task without producing an error?

Options:

A.

DROP DATABASE database_name;

B.

DROP TABLE database_name.table_name;

C.

DELETE TABLE database_name.table_name;

D.

DELETE TABLE table_name FROM database_name;

E.

DROP TABLE table_name FROM database_name;
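
As an aside on Question 9, the schema-qualified form of a drop statement can be tried locally. The sketch below uses Python's built-in `sqlite3` module as a stand-in, which is an assumption made for convenience: SQLite is not Databricks, it has no notion of managed versus external tables, and it says nothing about how Databricks handles the underlying data files. The table `other_table` is a hypothetical name added only to show that sibling tables are unaffected.

```python
import sqlite3

# SQLite stand-in (assumption): an attached in-memory database plays the
# role of the schema called database_name in the question.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS database_name")

# table_name comes from the question; other_table is hypothetical.
conn.execute("CREATE TABLE database_name.table_name (id INTEGER)")
conn.execute("CREATE TABLE database_name.other_table (id INTEGER)")

# Drop a single table using its schema-qualified name.
conn.execute("DROP TABLE database_name.table_name")

# List the tables that still exist in the attached database.
remaining = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM database_name.sqlite_master "
        "WHERE type = 'table' ORDER BY name"
    )
]
print(remaining)
```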

Question 10

Which of the following statements about a refresh schedule is incorrect?

Options:

A.

A query can be refreshed anywhere from 1 minute to 2 weeks.

B.

Refresh schedules can be configured in the Query Editor.

C.

A query being refreshed on a schedule does not use a SQL Warehouse (formerly known as SQL Endpoint).

D.

A refresh schedule is not the same as an alert.

E.

You must have workspace administrator privileges to configure a refresh schedule.

Question 11

A data analyst runs the following command:

INSERT INTO stakeholders.suppliers TABLE stakeholders.new_suppliers;

What is the result of running this command?

Options:

A.

The suppliers table now contains both the data it had before the command was run and the data from the new suppliers table, and any duplicate data is deleted.

B.

The command fails because it is written incorrectly.

C.

The suppliers table now contains both the data it had before the command was run and the data from the new suppliers table, including any duplicate data.

D.

The suppliers table now contains the data from the new suppliers table, and the new suppliers table now contains the data from the suppliers table.

E.

The suppliers table now contains only the data from the new suppliers table.
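
As an aside on Question 11, the general behavior of inserting one table's rows into another can be explored locally. The sketch below uses Python's built-in `sqlite3` module as a stand-in, which is an assumption: SQLite does not accept the `INSERT INTO ... TABLE ...` form shown in the question, so the equivalent `INSERT INTO ... SELECT * FROM ...` is used instead, and the table names are unqualified. The column names and row values are hypothetical.

```python
import sqlite3

# SQLite stand-in (assumption): unqualified table names replace
# stakeholders.suppliers and stakeholders.new_suppliers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE suppliers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE new_suppliers (id INTEGER, name TEXT)")

# Hypothetical rows; (2, 'Bolt') appears in both tables on purpose.
conn.executemany(
    "INSERT INTO suppliers VALUES (?, ?)", [(1, "Acme"), (2, "Bolt")]
)
conn.executemany(
    "INSERT INTO new_suppliers VALUES (?, ?)", [(2, "Bolt"), (3, "Cogs")]
)

# Append every row of new_suppliers to suppliers.
conn.execute("INSERT INTO suppliers SELECT * FROM new_suppliers")

rows = conn.execute(
    "SELECT id, name FROM suppliers ORDER BY id, name"
).fetchall()
source_count = conn.execute("SELECT COUNT(*) FROM new_suppliers").fetchone()[0]

print(rows)
print("rows left in new_suppliers:", source_count)
```

In this sketch the insert appends rows without deduplicating them, and the source table is left unchanged.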

Question 12

Which of the following describes how Databricks SQL should be used in relation to other business intelligence (BI) tools like Tableau, Power BI, and Looker?

Options:

A.

As an exact substitute with the same level of functionality

B.

As a substitute with less functionality

C.

As a complete replacement with additional functionality

D.

As a complementary tool for professional-grade presentations

E.

As a complementary tool for quick in-platform BI work