Winter Special - Limited Time 65% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: top65certs

Databricks Databricks-Certified-Data-Engineer-Associate Based on Real Exam Environment

Databricks Certified Data Engineer Associate Exam Questions and Answers

Question 17

A data engineer is designing a data pipeline. The source system generates files in a shared directory that is also used by other processes. As a result, the files should be kept as is and will accumulate in the directory. The data engineer needs to identify which files are new since the previous run in the pipeline, and set up the pipeline to only ingest those new files with each run.

Which of the following tools can the data engineer use to solve this problem?

Options:

A.

Unity Catalog

B.

Delta Lake

C.

Databricks SQL

D.

Data Explorer

E.

Auto Loader

Question 18

Which of the following data workloads will utilize a Gold table as its source?

Options:

A.

A job that enriches data by parsing its timestamps into a human-readable format

B.

A job that aggregates uncleaned data to create standard summary statistics

C.

A job that cleans data by removing malformatted records

D.

A job that queries aggregated data designed to feed into a dashboard

E.

A job that ingests raw data from a streaming source into the Lakehouse

Question 19

A data analyst has developed a query that runs against Delta table. They want help from the data engineering team to implement a series of tests to ensure the data returned by the query is clean. However, the data engineering team uses Python for its tests rather than SQL.

Which of the following operations could the data engineering team use to run the query and operate with the results in PySpark?

Options:

A.

SELECT * FROM sales

B.

spark.delta.table

C.

spark.sql

D.

There is no way to share data between PySpark and SQL.

E.

spark.table

Question 20

Which tool is used by Auto Loader to process data incrementally?

Options:

A.

Spark Structured Streaming

B.

Unity Catalog

C.

Checkpointing

D.

Databricks SQL