
AWS Certified Data Engineer - Associate (DEA-C01) Questions and Answers

Question 9

A company uses an Amazon Redshift Single-AZ cluster for enterprise analytics. The company wants to set up a highly resilient disaster recovery (DR) solution for the cluster. The solution must meet a recovery time objective (RTO) of less than 1 hour.

Which solution will meet this requirement MOST cost-effectively?

Options:

A.

Use a Redshift dense storage (DS2) node. Enable Multi-AZ deployment.

B.

Use a Redshift RA3 node. Enable Multi-AZ deployment.

C.

Configure a Redshift cluster from a cross-Region snapshot copy in a second AWS Region when necessary.

D.

Use a Redshift RA3 node. Enable cluster relocation.
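The two RA3-based options above differ in a single API parameter. As a hedged sketch (cluster identifiers and node counts are illustrative assumptions, not from the question), the boto3 `create_cluster` requests would look like:

```python
# Hypothetical request payloads for the two RA3-based options.
# Cluster names and node counts are illustrative assumptions.

multi_az_request = {
    "ClusterIdentifier": "analytics-dr",     # hypothetical name
    "NodeType": "ra3.4xlarge",               # Multi-AZ requires RA3 nodes
    "NumberOfNodes": 2,
    "MultiAZ": True,                         # compute in two AZs, automatic failover
}

relocation_request = {
    "ClusterIdentifier": "analytics-reloc",  # hypothetical name
    "NodeType": "ra3.4xlarge",
    "NumberOfNodes": 2,
    "AvailabilityZoneRelocation": True,      # one set of nodes, moved to a new AZ on failure
}

# With AWS credentials configured, either dict could be passed as:
#   boto3.client("redshift").create_cluster(**request)
```

The trade-off the question targets: Multi-AZ keeps a second set of compute nodes running (higher cost, lowest RTO), while relocation reuses the same node count and relies on Redshift moving the cluster to another Availability Zone.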

Question 10

A media company uses software as a service (SaaS) applications to gather data by using third-party tools. The company needs to store the data in an Amazon S3 bucket. The company will use Amazon Redshift to perform analytics based on the data.

Which AWS service or feature will meet these requirements with the LEAST operational overhead?

Options:

A.

Amazon Managed Streaming for Apache Kafka (Amazon MSK)

B.

Amazon AppFlow

C.

AWS Glue Data Catalog

D.

Amazon Kinesis
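Of the options above, only Amazon AppFlow is a fully managed SaaS-to-AWS integration service. A minimal sketch of an AppFlow flow definition, assuming a Salesforce source and a hypothetical bucket name (flow name, object, and schedule are all illustrative):

```python
# Hypothetical AppFlow flow definition: pull records from a SaaS connector
# into S3 on a schedule, with no servers or custom code to manage.
# Flow name, source object, schedule, and bucket are assumptions.

flow_request = {
    "flowName": "saas-to-s3",                            # hypothetical
    "triggerConfig": {
        "triggerType": "Scheduled",
        "triggerProperties": {
            "Scheduled": {"scheduleExpression": "rate(1hours)"},
        },
    },
    "sourceFlowConfig": {
        "connectorType": "Salesforce",                   # any supported SaaS connector
        "sourceConnectorProperties": {
            "Salesforce": {"object": "Account"},         # hypothetical object
        },
    },
    "destinationFlowConfigList": [{
        "connectorType": "S3",
        "destinationConnectorProperties": {
            "S3": {"bucketName": "media-co-raw-data"},   # hypothetical bucket
        },
    }],
    "tasks": [{
        "sourceFields": ["Id"],
        "taskType": "Map_all",
        "connectorOperator": {"Salesforce": "NO_OP"},
    }],
}

# With AWS credentials configured:
#   boto3.client("appflow").create_flow(**flow_request)
```

Once the data lands in S3, Redshift can load it with `COPY` or query it in place through Redshift Spectrum.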

Question 11

A company runs an extract, transform, and load (ETL) job in AWS Glue. The job processes personally identifiable information (PII) data and writes logs to an Amazon CloudWatch Logs log group. A data engineer needs to mask PII data in the CloudWatch Logs log group.

Which solution will meet these requirements?

Options:

A.

Attach an AWS Glue security configuration to the ETL job.

B.

Configure a data protection policy. Attach the policy to the CloudWatch log group.

C.

Run an Amazon Macie sensitive data discovery job.

D.

Call AWS Glue sensitive data detection APIs in the ETL job.
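CloudWatch Logs masks PII through a data protection policy attached to the log group. A minimal sketch of such a policy, assuming an email-address identifier and a hypothetical Glue log group name:

```python
import json

# Hypothetical CloudWatch Logs data protection policy: audit, then mask,
# occurrences of a managed data identifier. The identifier choice and the
# log group name are illustrative assumptions.

policy = {
    "Name": "mask-pii",
    "Version": "2021-06-01",
    "Statement": [
        {
            "Sid": "audit",
            "DataIdentifier": [
                "arn:aws:dataprotection::aws:data-identifier/EmailAddress"],
            "Operation": {"Audit": {"FindingsDestination": {}}},
        },
        {
            "Sid": "mask",
            "DataIdentifier": [
                "arn:aws:dataprotection::aws:data-identifier/EmailAddress"],
            "Operation": {"Deidentify": {"MaskConfig": {}}},
        },
    ],
}
policy_document = json.dumps(policy)

# With AWS credentials configured:
#   boto3.client("logs").put_data_protection_policy(
#       logGroupIdentifier="/aws-glue/jobs/output",  # hypothetical log group
#       policyDocument=policy_document,
#   )
```

The policy masks matching data in log events as they are ingested, so anyone without the `logs:Unmask` permission sees only the redacted form.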

Question 12

A data engineer is using AWS Glue to build an extract, transform, and load (ETL) pipeline that processes streaming data from sensors. The pipeline sends the data to an Amazon S3 bucket in near real time. The data engineer also needs to perform transformations and join the incoming data with metadata that is stored in an Amazon RDS for PostgreSQL database. The data engineer must write the results back to a second S3 bucket in Apache Parquet format.

Which solution will meet these requirements?

Options:

A.

Use an AWS Glue streaming job and AWS Glue Studio to perform the transformations and to write the data in Parquet format.

B.

Use AWS Glue jobs and AWS Glue Data Catalog to catalog the data from Amazon S3 and Amazon RDS. Configure the jobs to perform the transformations and joins and to write the output in Parquet format.

C.

Use an AWS Glue interactive session to process the streaming data and to join the data with the RDS database.

D.

Use an AWS Glue Python shell job to run a Python script that processes the data in batches. Keep track of processed files by using AWS Glue bookmarks.
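Continuous stream processing in Glue requires a streaming job type, not an interactive session or a batch Python shell job. A hedged sketch of the job definition behind option A (role ARN, script path, and bucket names are illustrative assumptions):

```python
# Hypothetical AWS Glue streaming job definition. The script it points to
# would read the sensor stream, join it with the RDS metadata over a JDBC
# connection, and write Parquet to a second bucket. Name, role, script
# location, and buckets are all assumptions.

streaming_job = {
    "Name": "sensor-stream-etl",                                  # hypothetical
    "Role": "arn:aws:iam::123456789012:role/GlueStreamingRole",   # hypothetical
    "Command": {
        "Name": "gluestreaming",                  # streaming ETL job type
        "ScriptLocation": "s3://example-scripts/sensor_etl.py",   # hypothetical
        "PythonVersion": "3",
    },
    "GlueVersion": "4.0",
    "DefaultArguments": {
        "--enable-glue-datacatalog": "true",
        "--output_path": "s3://example-curated/",  # hypothetical second bucket
    },
}

# With AWS credentials configured:
#   boto3.client("glue").create_job(**streaming_job)
```

A streaming job of this type runs continuously on micro-batches, which is what gives the pipeline its near-real-time delivery to S3.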