
Free and Premium CompTIA DA0-001 Dumps Questions Answers

Page: 1 / 30
Total 396 questions

CompTIA Data+ Certification Exam Questions and Answers

Question 1

Given the image below:

Which of the following file formats is depicted?

Options:

A.

JSON

B.

CSV

C.

XML

D.

HTML

Question 2

Which of the following roles is responsible for ensuring an organization's data quality, security, privacy, and regulatory compliance?

Options:

A.

Data owner.

B.

Data steward.

C.

Data custodian.

D.

Data processor.

Question 3

A data analyst has a set with more than 40,000 rows in the sample schema below:

The analyst would like to create one column that contains the customers’ birth dates. Which of the following data quality dimensions would BEST explain the reason for compilation?

Options:

A.

Data accuracy

B.

Data completeness

C.

Data duplication

D.

Data integrity

Question 4

Which of the following technologies would be best suited for creating a multiple linear regression model?

Options:

A.

Microsoft Power BI

B.

R

C.

SQL

D.

Tableau

Question 5

Andy is a pricing analyst for a retailer. Using a hypothesis test, he wants to assess whether people who receive electronic coupons spend more on average.

What should Andy's null hypothesis be?

Options:

A.

People who receive electronic coupons spend more on average.

B.

People who receive electronic coupons spend less on average.

C.

People who receive electronic coupons do not spend more on average.

D.

People who do not receive electronic coupons spend more on average.

Question 6

Which of the following is an example of a strategy to reduce statistical errors?

Options:

A.

Removing outliers

B.

Adding more data

C.

Transformation

D.

Recoding data

Question 7

John is working on an ELT process that sources data from six different source systems.

Looking at the source data, he finds that data about the same people exists in two of the six systems.

What does he have to make sure he checks for in his ELT process?

Choose the best answer.

Options:

A.

Duplicate Data.

B.

Redundant Data.

C.

Invalid Data.

D.

Missing Data.

Question 8

Given the following table:

Which of the following describes the data quality issues with the age data?

Options:

A.

Completeness

B.

Consistency

C.

Accuracy

D.

Manipulation

Question 9

Which of the following is the correct extension for a tab-delimited spreadsheet file?

Options:

A.

tap

B.

tar

C.

tsv

D.

az

Question 10

Given the diagram below:

Which of the following data schemas is shown?

Options:

A.

Key-value pairs

B.

Online transactional processing

C.

Data Lake

D.

Relational database

Question 11

Which of the following types of analysis would be best for an analyst to use to examine the relationships between authors who cited other authors in a library of research papers?

Options:

A.

Linguistic analysis

B.

Trend analysis

C.

Link analysis

D.

Performance analysis

Question 12

A customer survey reveals 90% positive feedback. Which of the following statistical methods would be best to utilize to determine the reliability of a data set and predict how a larger sample of customers over the same time period might respond?

Options:

A.

Calculate a high variance on survey responses.

B.

Calculate the maximum range of the survey responses.

C.

Calculate a low standard deviation on survey responses.

D.

Remove any data more than 4 standard deviations from the mean.

Question 13

An analyst must obtain the average daily sales for the following week:

Which of the following must the analyst perform to obtain this value?

Options:

A.

Data normalization

B.

Data append

C.

Data aggregation

D.

Data blending

Question 14

Which of the following data governance concepts fits into the security requirements category?

Options:

A.

Data transmission

B.

Data deletion

C.

Data use agreements

D.

Personally identifiable information

Question 15

A survey asks participants to rate a company on a scale of one to ten. Which of the following best describes the rating variable?

Options:

A.

Continuous

B.

Ordinal

C.

Categorical

D.

Nominal

Question 16

An analyst is compiling a series of reports for the new executive board to review. Which of the following elements provides a snapshot of what is contained in the reports for the executives who do not have time to focus on the details?

Options:

A.

Tables

B.

Reference data sources

C.

Observations and insights

D.

Instruction page

Question 17

An analyst modified a data set that had a number of issues. Given the original and modified versions:

Which of the following data manipulation techniques did the analyst use?

Options:

A.

Imputation

B.

Recoding

C.

Parsing

D.

Deriving

Question 18

A business intelligence engineer needs to reduce the size of a data model for reporting purposes. The data set contains more than one million rows, and the table has a date-time column named Date. Which of the following should the analyst do to complete this task?

Options:

A.

Change the data type of the Date column to text.

B.

Trim the date.

C.

Round the hour of the Date column to the start of the hour.

D.

Split the Date column into two columns—time and date.
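As a rough illustration of how date-time granularity affects the number of distinct values in a column, the sketch below floors timestamps to the start of the hour. The timestamps and the helper name `floor_to_hour` are made up for this example. Fewer distinct values can help a columnar reporting engine compress the column, though the actual saving depends on the tool.

```python
from datetime import datetime

# Hypothetical timestamps; the question does not show the real data.
timestamps = [
    datetime(2020, 1, 1, 9, 5, 12),
    datetime(2020, 1, 1, 9, 47, 3),
    datetime(2020, 1, 1, 10, 15, 0),
]


def floor_to_hour(ts):
    """Drop minutes, seconds, and microseconds so the value sits on the hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


rounded = [floor_to_hour(ts) for ts in timestamps]

# Count distinct values before and after flooring.
distinct_before = len(set(timestamps))
distinct_after = len(set(rounded))
```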

Question 19

Which of the following is concatenate typically used to combine?

Options:

A.

Rows

B.

Columns

C.

Tables

D.

Databases

Question 20

An analyst develops an IT document and needs to describe the technical terms used in the document. Which of the following is where the analyst should include descriptions of the technical terms?

Options:

A.

Glossary

B.

System diagram

C.

User requirements

D.

Index

Question 21

Which of the following data cleansing issues will be fixed when a DISTINCT function is applied?

Options:

A.

Missing data

B.

Duplicate data

C.

Redundant data

D.

Invalid data
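To see what DISTINCT does to a result set, here is a minimal sketch using Python's built-in sqlite3 module. The `customers` table and its rows are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical in-memory table; names and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "Lima"), ("Ana", "Lima"), ("Ben", "Oslo")],
)

# Without DISTINCT every stored row comes back, including the repeated one.
all_rows = conn.execute("SELECT name, city FROM customers").fetchall()

# With DISTINCT, rows that are identical in every selected column collapse to one.
distinct_rows = conn.execute(
    "SELECT DISTINCT name, city FROM customers ORDER BY name"
).fetchall()
conn.close()
```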

Question 22

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D, over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Question 23

A data analyst for a media company needs to determine the most popular movie genre. Given the table below:

Which of the following must be done to the Genre column before this task can be completed?

Options:

A.

Append

B.

Merge

C.

Concatenate

D.

Delimit

Question 24

Which of the following is most likely to be used as a data-mining ETL tool?

Options:

A.

SSIS

B.

Stata

C.

SPSS

D.

Cognos

Question 25

A business unit made the following modification to the values in a table:

Which of the following data quality dimensions was applied in this scenario?

Options:

A.

Integrity

B.

Consistency

C.

Completeness

D.

Accuracy

Question 26

Which of the following best defines SCD?

Options:

A.

A technique used to profile data.

B.

A technique used to sort large data sets.

C.

A technique used to archive data.

D.

A technique used to manage historical data changes.

Question 27

A data analyst has been asked to organize the table below in the following ways:

By sales from high to low -

By state in alphabetic order -

Which of the following functions will allow the data analyst to organize the table in this manner?

Options:

A.

Conditional formatting

B.

Grouping

C.

Filtering

D.

Sorting

Question 28

Given the table below:

Which of the following variable types BEST describes the “Year” column?

Options:

A.

Numeric

B.

Date

C.

Alphanumeric

D.

Text

Question 29

Which of the following differentiates a flat text file from other data types?

Options:

A.

Data is separated by a delimiter.

B.

Data is stored in defined rows.

C.

Data is defined with key-value pairs.

D.

Data is housed in a markup language.

Question 30

A data analyst needs to create a dashboard using the company's yearly revenue data sets. Which of the following would be the best way to plot the information to show the top-performing region?

Options:

A.

A line chart

B.

A waterfall chart

C.

A heat map

D.

A stacked bar chart

Question 31

Which of the following is a KPI metric for tracking sales performance?

Options:

A.

Order status percentage

B.

Customer acquisition percentage

C.

Gross profit percentage

D.

Click-through rate percentage

Question 32

A data analyst reviews the following data set:

Which of the following is the range value?

Options:

A.

9

B.

10

C.

12

D.

13

Question 33

A sales analyst needs to report how the sales team is performing to target. Which of the following files will be important in determining 2019 performance attainment?

Options:

A.

2018 goal data

B.

2018 actual revenue

C.

2019 goal data

D.

2019 commission plan

Question 34

Which of the following types of data manipulation functions should a data analyst use to implement a YES/NO condition in a spreadsheet?

Options:

A.

Text

B.

Statistical

C.

Financial

D.

Logical

Question 35

A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:

Customer Table -

In-store Transactions –

Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?

Options:

A.

INNER: 6 rows; LEFT: 9 rows

B.

INNER: 9 rows; LEFT: 6 rows

C.

INNER: 9 rows; LEFT: 15 rows

D.

INNER: 15 rows; LEFT: 9 rows
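The tables for this question are not reproduced here, so the sketch below uses small hypothetical tables (three customers, three transactions) to show how row counts differ between an INNER JOIN and a LEFT JOIN when the customer table is the left table. The counts apply to this made-up data only, not to the exam's tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: customer 3 has no transactions, customer 1 has two.
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    CREATE TABLE transactions (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO transactions VALUES (1, 20.0), (1, 35.0), (2, 12.5);
    """
)

# INNER JOIN keeps only customers that have at least one matching transaction.
inner_count = conn.execute(
    "SELECT COUNT(*) FROM customers c "
    "INNER JOIN transactions t ON t.customer_id = c.customer_id"
).fetchone()[0]

# LEFT JOIN keeps every customer; unmatched customers get NULL transaction columns.
left_count = conn.execute(
    "SELECT COUNT(*) FROM customers c "
    "LEFT JOIN transactions t ON t.customer_id = c.customer_id"
).fetchone()[0]
conn.close()
```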

Question 36

After completing web scraping, which of the following file formats needs to be parsed?

Options:

A.

.html

B.

.txt

C.

.csv

D.

.tsv

Question 37

An analyst wants to extract data from a variety of sources and store the data in a cloud-based environment prior to cleaning. Which of the following integration techniques should the analyst use?

Options:

A.

ETL

B.

API

C.

SQL

D.

ELT

Question 38

Given the table below:

Which of the following variables can be considered inconsistent, and how many distinct values should the variable have?

Options:

A.

Name, one

B.

Gender, two

C.

Level, three

D.

Code, four

E.

Region, five

Question 39

The process of performing initial investigations on data to spot outliers, discover patterns, and test assumptions with statistical insight and graphical visualization is called:

Options:

A.

a t-test.

B.

a performance analysis.

C.

an exploratory data analysis.

D.

a link analysis.

Question 40

Which of the following is an example of a flat file?

Options:

A.

CSV file

B.

PDF file

C.

JSON file

D.

JPEG file

Question 41

While reviewing survey data, an analyst notices respondents entered “Jan,” “January,” and “01” as responses for the month of January. Which of the following steps should be taken to ensure data consistency?

Options:

A.

Delete any of the responses that do not have “January” written out.

B.

Replace any of the responses that have “01”.

C.

Filter on any of the responses that do not say “January” and update them to “January”.

D.

Sort any of the responses that say “Jan” and update them to “01”.
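One way to bring mixed month entries to a single representation is a lookup table. The sketch below is an assumption-laden example: the alias dictionary and the function name `standardize_month` are hypothetical, and the choice of "January" as the target form is only one possible convention.

```python
# Hypothetical alias table covering the three variants named in the question.
MONTH_ALIASES = {"jan": "January", "january": "January", "01": "January"}


def standardize_month(value):
    """Map a known variant to one canonical spelling; leave unknown values as-is."""
    key = value.strip().lower()
    return MONTH_ALIASES.get(key, value)


responses = ["Jan", "January", "01", "jan "]
cleaned = [standardize_month(r) for r in responses]
```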

Question 42

Which of the following database types is the best to use for transactional SQL?

Options:

A.

Snowflake schema

B.

Hierarchical

C.

Relational

D.

Star schema

Question 43

Which of the following data types would a telephone number formatted as XXX-XXX-XXXX be considered?

Options:

A.

Numeric

B.

Date

C.

Float

D.

Text

Question 44

A database consists of one fact table that is composed of multiple dimensions. Each dimension is represented by a denormalized table. This structure is an example of a:

Options:

A.

non-relational schema.

B.

galaxy schema.

C.

snowflake schema.

D.

star schema.

Question 45

Which of the following is the best variable format to store a customer's age using the least possible amount of storage data?

Options:

A.

Int

B.

Float

C.

Char

D.

Double

Question 46

Which of the following would be used to store unstructured data from different sources?

Options:

A.

A data lake

B.

A database management system

C.

A database

D.

A data warehouse

Question 47

A data analyst is reviewing SQL code and sees a query that uses terms such as MIN, SUM, and COUNT. Which of the following types of functions best describes these terms?

Options:

A.

Aggregate

B.

Logical

C.

Filtering

D.

System
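For reference, this is how MIN, SUM, and COUNT behave in a query. The sketch uses Python's sqlite3 module with a hypothetical `orders` table; each function reduces many rows to a single value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and values, chosen only to make the results easy to check.
conn.execute("CREATE TABLE orders (order_total REAL)")
conn.executemany("INSERT INTO orders VALUES (?)", [(10.0,), (25.5,), (64.5,)])

# One result row summarizes all three input rows.
minimum, total, count = conn.execute(
    "SELECT MIN(order_total), SUM(order_total), COUNT(*) FROM orders"
).fetchone()
conn.close()
```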

Question 48

A data analyst needs to create a master file that includes customer information from the tables below:

Given the three tables above, the analyst wants to filter down the information prior to joining it together. In which of the following orders should this data manipulation be approached for the most efficient result?

Options:

A.

Merge, append, deduplicate

B.

Merge, deduplicate, append

C.

Deduplicate, append, merge

D.

Append, deduplicate, merge

Question 49

A data analyst must separate the column shown below into multiple columns for each component of the name:

Which of the following data manipulation techniques should the analyst perform?

Options:

A.

Imputing

B.

Transposing

C.

Parsing

D.

Concatenating
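The column from the question is not shown here, so the following sketch assumes a hypothetical "First Middle Last" layout with exactly three space-separated parts. It shows the general idea of breaking one text field into its components; real name data usually needs more careful handling.

```python
# Hypothetical values; the real column layout is not given in the text.
full_names = ["Maria Elena Lopez", "John Q Public"]


def parse_name(full_name):
    """Split a 'First Middle Last' string into three labeled components."""
    first, middle, last = full_name.split(" ", 2)
    return {"first": first, "middle": middle, "last": last}


parsed = [parse_name(name) for name in full_names]
```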

Question 50

The ACME Corporation hired an analyst to detect data quality issues in their Excel documents. Which of the following are the most common issues? (Select TWO)

Options:

A.

Apostrophe.

B.

Commas.

C.

Symbols.

D.

Duplicates.

E.

Misspellings.

Question 51

An analyst is designing a dashboard to determine which site has the highest percentage of new customers. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered to BEST display the data?

Options:

A.

Include a bar chart using the site and the percentage of new customers data.

B.

Include a line chart using the site and the percentage of new customers data.

C.

Include a pie chart using the site and percentage of new customers data.

D.

Include a scatter chart using the site and the percent of new customers data.

Question 52

Given the following table:

Which of the following methods is the best way to describe the changes in the values in the table?

Options:

A.

Average

B.

Range

C.

Standard deviation

D.

Median

Question 53

Joe, an analyst, tests the loading time on a dashboard he is preparing to go live and finds it is slower than he would like. Which of the following must occur to decrease the loading time?

Options:

A.

Deploy the dashboard to production.

B.

Change the field definitions.

C.

Update the dashboard subscribers.

D.

Optimize the dashboard.

Question 54

Given the below:

Which of the following numbers represents a Type I error?

Options:

A.

1

B.

2

C.

3

D.

4

Question 55

Given the following grocery store orders:

If a query is made to the table with the following logic:

Order_Total > 132 OR (Order_Total >= 25 AND Order_Total < 74)

Which of the following is the number of orders that will be returned by the query?

Options:

A.

Four

B.

Five

C.

Six

D.

Seven
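The order table is not reproduced here, so the sketch below applies the same filter logic to a hypothetical list of order totals. It is meant to show how the OR and AND conditions combine, including the boundary values 25, 74, and 132. The count it produces applies only to the made-up list.

```python
# Hypothetical order totals, including the boundary values from the condition.
order_totals = [12, 25, 50, 74, 100, 132, 133, 200]


def matches(order_total):
    """Order_Total > 132 OR (Order_Total >= 25 AND Order_Total < 74)."""
    return order_total > 132 or (25 <= order_total < 74)


returned = [total for total in order_totals if matches(total)]
```

Note that 25 is included (>=), while 74 and 132 are excluded (strict < and >).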

Question 56

An analyst is building a new dashboard for a user. After an initial conversation with the user, the analyst created a mock-up of the dashboard. Which of the following best explains why the analyst created the mock-up?

Options:

A.

To identify the dimensions and measures

B.

To send to the client after deploying the dashboard to production

C.

To confirm important details before dashboard development begins

D.

To receive client approval for the final dashboard design

Question 57

An analyst is working on a project for a director. During this process, the analyst pulled the data, created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?

Options:

A.

Complete an audit on the data pulled for the report.

B.

Complete a check for quality in the report.

C.

Complete a review of the data and a check for consistency.

D.

Complete a trend analysis to be included in the report.

Question 58

A sales manager requested a report that contains the first name, last name, and phone number of all the company’s customers and employees. The data engineer needs to return all the records from several tables, even duplicates. Which of the following is the best way to join the two tables?

Options:

A.

FULL OUTER JOIN

B.

INNER JOIN

C.

LEFT OUTER JOIN

D.

CROSS JOIN

Question 59

Amanda needs to create a dashboard that will draw information from many other data sources and present it to business leaders.

Which one of the following tools is least likely to meet her needs?

Options:

A.

QuickSight.

B.

Tableau.

C.

Power BI.

D.

SPSS Modeler.

Question 60

A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:

Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?

Options:

A.

$640,900

B.

$690,000

C.

$705,200

D.

$702,500

Question 61

A data analyst has received a data set that contains actual and projected sales for the fourth quarter of 2019. Which of the following statistical methods should the analyst use to find the measure of dispersion?

Options:

A.

Mean

B.

Variance

C.

Correlation

D.

Confidence interval
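As a small worked example of a dispersion measure, the sketch below computes the population variance and standard deviation with Python's statistics module. The sample values are hypothetical and are not taken from the question's data set.

```python
import statistics

# Hypothetical values with mean 5; squared deviations sum to 32 over 8 points.
sales = [2, 4, 4, 4, 5, 5, 7, 9]

# Variance is the average squared distance from the mean: 32 / 8 = 4.
population_variance = statistics.pvariance(sales)

# The standard deviation is the square root of the variance.
population_std_dev = statistics.pstdev(sales)
```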

Question 62

A data engineer is creating a database field to capture whether a customer likes vanilla ice cream. Which of the following data types is the best to capture this information?

Options:

A.

Integer

B.

Boolean

C.

Categorical

D.

Numeric

Question 63

A gambler thinks that a coin is fair and is equally likely to turn up heads or tails when the coin is flipped. Which of the following tests should the gambler use to test this hypothesis?

Options:

A.

t-test

B.

Chi-squared test

C.

Rank sum test

D.

Ratio test
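For readers who want to see a goodness-of-fit calculation for a coin, the sketch below computes a chi-squared statistic by hand. The flip counts are hypothetical. The critical value 3.841 is the standard table value for one degree of freedom at a 0.05 significance level.

```python
# Hypothetical outcome of 100 flips.
observed = {"heads": 60, "tails": 40}

total_flips = sum(observed.values())
# Under the fair-coin hypothesis each outcome is expected equally often.
expected = total_flips / len(observed)

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
chi_squared = sum(
    (count - expected) ** 2 / expected for count in observed.values()
)

# Table value for df = 1, alpha = 0.05.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
reject_fair_coin = chi_squared > CRITICAL_VALUE_DF1_ALPHA_05
```

With these made-up counts the statistic is (10² / 50) + (10² / 50) = 4.0, which is just above the critical value.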

Question 64

A data analyst is asked to create a sales report for the second-quarter 2020 board meeting, which will include a review of the business’s performance through the second quarter. The board meeting will be held on July 15, 2020, after the numbers are finalized. Which of the following report types should the data analyst create?

Options:

A.

Static

B.

Real-time

C.

Self-service

D.

Dynamic

Question 65

A research analyst collects ten data points from 1,000 specimens. The analyst will not need any additional data to complete the analysis and will not need to retrieve information by specifier. Which of the following is the best data structure for the analyst to use?

Options:

A.

NoSQL

B.

Flat file

C.

JSON

D.

Relational database

Question 66

Which of the following best describes a 95% confidence interval?

Options:

A.

There is a 95% probability that a sample is within one standard deviation of the mean.

B.

A stated range may contain 95% of the population mean, 95% of the time.

C.

A set of ranges contains the population mean with 95% certainty.

D.

A range contains 95% of the population mean.

Question 67

A data analyst is helping a retail store categorize its customers into five different groups based on the following information:

• How recently the customers made purchases

• How frequently the customers made purchases

• How much the customers spent

Given the following information:

Which of the following would be most important for the analysis?

Options:

A.

Customer_ID, Channel, Order_Date

B.

Customer_ID, Territory, Amount

C.

Customer_ID, Order_Date, Amount

D.

Customer_ID, Quantity, Amount

Question 68

A data analyst is creating a report that will provide information about various regions, products, and time periods. Which of the following formats would be the most efficient way to deliver this report?

Options:

A.

A workbook with multiple tabs for each region

B.

A daily email with snapshots of regional summaries

C.

A static report with a different page for every filtered view

D.

A dashboard with filters at the top that the user can toggle

Question 69

A data analyst has removed the outliers from a data set due to large variances. Which of the following central tendencies would be the best measure to use?

Options:

A.

Range

B.

Mean

C.

Mode

D.

Median

Question 70

What analytics suite is offered by Microsoft and directly integrates with SQL Server Databases?

Options:

A.

Qlik.

B.

Power BI.

C.

Domo.

D.

Dataroma.

Question 71

Which of the following types of dashboards should a business intelligence engineer develop in order to provide information about failed data pipelines?

Options:

A.

Referencing

B.

Strategic

C.

Operational

D.

Technical

Question 72

A large data download was divided into two smaller files. Which of the following describes the best way to fix this issue?

Options:

A.

Blending the two data sets

B.

Appending the two data sets

C.

Merging the two data sets

D.

Aggregating the two data sets

Question 73

A data analyst is developing a dashboard to track and monitor metrics. Which of the following best practices should be taken into consideration during the FIRST step of the development process?

Options:

A.

Create a wireframe.

B.

Deploy to production.

C.

Copy a dashboard design from the Internet.

D.

Develop a dashboard.

Question 74

Which of the following is a common data analytics tool that is also used as an interpreted, high-level, general-purpose programming language?

Options:

A.

SAS

B.

Microsoft Power BI

C.

IBM SPSS

D.

Python

Question 75

What SQL command is used to delete an entire table from a database?

Options:

A.

DROP.

B.

MODIFY.

C.

DELETE.

D.

ALTER.
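The difference between removing rows and removing a table can be checked directly. The sketch below uses Python's sqlite3 module with a hypothetical `staging` table. The helper `table_exists` is defined here for illustration and queries SQLite's `sqlite_master` catalog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (id INTEGER)")  # hypothetical table
conn.execute("INSERT INTO staging VALUES (1)")


def table_exists(name):
    """Return True if a table with this name is listed in SQLite's catalog."""
    found = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()[0]
    return found == 1


# DELETE removes the rows but leaves the empty table in place.
conn.execute("DELETE FROM staging")
rows_after_delete = conn.execute("SELECT COUNT(*) FROM staging").fetchone()[0]
exists_after_delete = table_exists("staging")

# DROP TABLE removes the table definition itself.
conn.execute("DROP TABLE staging")
exists_after_drop = table_exists("staging")
conn.close()
```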

Question 76

Given the following data tables:

Which of the following MDM processes needs to take place FIRST?

Options:

A.

Creation of a data dictionary

B.

Compliance with regulations

C.

Standardization of data field names

D.

Consolidation of multiple data fields

Question 77

A data analyst needs to observe the relationship between two numeric variables and identify the clustering pattern as well as the outliers. Which of the following visualizations should the analyst use?

Options:

A.

Heat map

B.

Tree map

C.

Scatter plot

D.

Stacked chart

Question 78

A web developer wants to ensure that malicious users can't type SQL statements when they are asked for input, like their username/userid.

Which of the following query optimization techniques would effectively prevent SQL Injection attacks?

Options:

A.

Indexing.

B.

Subset of records.

C.

Temporary table in the query set.

D.

Parametrization.
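To make the injection risk concrete, the sketch below contrasts a query built by string concatenation with one that uses a bound parameter. It uses Python's sqlite3 module. The `users` table and the malicious input string are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")  # hypothetical table
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

# Input crafted so that, when pasted into SQL text, it adds an always-true condition.
malicious_input = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL statement itself.
unsafe_sql = (
    "SELECT username FROM users WHERE username = '" + malicious_input + "'"
)
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the ? placeholder binds the input as a plain value, never as SQL.
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?", (malicious_input,)
).fetchall()
conn.close()
```

In this sketch the concatenated query returns every user, while the parameterized query returns nothing because no username equals the literal input string.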

Question 79

Which of the following activities occurs during the ETL process?

Options:

A.

Reviewing and addressing missing values

B.

Creating a dashboard

C.

Inserting a pivot table and pivot chart

D.

Multiplying unique data

Question 80

Which of the following best describes how discrete data differs from continuous data?

Options:

A.

Discrete data cannot create a sloped line.

B.

Discrete data can only be a finite number of values.

C.

Discrete data can have decimal points.

D.

Discrete data applies only to numbers.

Question 81

Given the following data set:

Which of the following is the best reason for cleansing the data?

Options:

A.

Duplicate data

B.

Imputed data

C.

Redundant data

D.

Corrupt data

Question 82

An analyst is reporting on the average income for a county and is reviewing the following data:

Which of the following is the reason the analyst would need to cleanse the data in this data set?

Options:

A.

Data completeness

B.

Data outliers

C.

Duplicate data

D.

Missing values

Question 83

An analyst computed a new variable of income per day in the household by multiplying the number of days worked by the number of people working in the household and the income earned per day. Which of the following is the correct name for this new variable?

Options:

A.

Derived

B.

Categorical

C.

Continuous

D.

Control

Question 84

During data profiling, an analyst decides to recode the status column in the following data set:

Which of the following data concerns explains why the analyst wants to take this action?

Options:

A.

Redundancy

B.

Duplication

C.

Invalidity

D.

Inconsistency

Question 85

A database administrator is required to mask certain table columns containing PII in order to comply with the company privacy policy. Which of the following are the most likely types of information the administrator should mask? (Select two).

Options:

A.

Government-issued ID

B.

Address

C.

Order ID

D.

Order date

E.

Customer ID

F.

Referral number

Question 86

A data analyst has been asked to create an ad-hoc sales report for the Chief Executive Officer (CEO).

Which of the following should be included in the report?

Options:

A.

The sales representatives' home addresses.

B.

Line-item SKU numbers.

C.

YTD total sales.

D.

The customers' first and last names.

Question 87

A financial institution is reporting on sales performance to a company at the account level. Due to the sensitive nature of the government business the institution deals with, some account information is not shown. Which of the following fields should be masked?

Options:

A.

Sales volume

B.

Start date

C.

Product name

D.

Customer name

Question 88

Which of the following data sampling methods involves dividing a population into subgroups by similar characteristics?

Options:

A.

Systematic

B.

Simple random

C.

Convenience

D.

Stratified

Question 89

A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered?

Options:

A.

Include a line chart using the site and average sales per customer.

B.

Include a pie chart using the site and sales to average sales per customer.

C.

Include a scatter chart using sales volume and average sales per customer.

D.

Include a column chart using the site and sales to average sales per customer.

Question 90

The number of phone calls that the call center receives in a day is an example of:

Options:

A.

continuous data.

B.

categorical data.

C.

ordinal data.

D.

discrete data.

Question 91

Which of the following describes the use of a representative amount of data from a main repository?

Options:

A.

Observation

B.

Delta load

C.

Web scraping

D.

Sampling

Question 92

Which of the following best describes the process of examining data for statistics and information about the data?

Options:

A.

Cleansing

B.

search

C.

Profiling

D.

Governance

Question 93

A publishing group has requested a dashboard to track submissions before publication. A key requirement is that all changes are tracked, as multiple users will be checking out documents and editing them before submissions are considered final. Which of the following is the BEST way to meet this stakeholder requirement?

Options:

A.

Display the version number next to each submission on the dashboard.

B.

Present a data refresh date at the top of the dashboard.

C.

Confirm the dashboard is adhering to the corporate style guide.

D.

Use permissions to ensure users only see certain versions of the submissions.

Question 94

Which of the following tools would be best to use to calculate the interquartile range, median, mean, and standard deviation of a column in a table that has 5,000,000 rows?

Options:

A.

Microsoft Excel

B.

R

C.

Snowflake

D.

SQL

Question 95

An organization would like to add a secondary email field to its customer database in order to enrich the customer profiles. Which of the following data manipulation techniques should the analyst use to add this information?

Options:

A.

Blend

B.

Merge

C.

Append

D.

Aggregate

Question 96

An analyst has been tracking company intranet usage and has been asked to create a chart to show the most-used/most-clicked portions of a homepage that contains more than 30 links. Which of the following visualizations would BEST illustrate this information?

Options:

A.

Scatter plot

B.

Heat map

C.

Pie chart

D.

Infographic

Question 97

A report is scheduled to run and be distributed at the end of business each day. On Mondays, one of the recipients opens the previous week's reports and combines them to calculate the weekly totals and projections for the coming week. This is a tedious process, and the recipient asks an analyst for help. Which of the following should the analyst recommend?

Options:

A.

Add calculation fields to the daily report so the totals are built in.

B.

Create a new report with weekly totals set to run at the end of business on Friday.

C.

Provide a daily summary to the report with totals to save the user the effort of manual calculations.

D.

Reduce the frequency of the report to once a week and change the date range.

Question 98

Exhibit.

Which of the following logical statements results in Table B?

A)

B)

C)

D)

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D

Question 99

Which of the following is a characteristic of a relational database?

Options:

A.

It utilizes key-value pairs.

B.

It has undefined fields.

C.

It is structured in nature.

D.

It uses minimal memory.

Question 100

Which of the following contains alphanumeric values?

Options:

A.

10.1Ε²

B.

13.6

C.

1347

D.

A3J7

Question 101

Which of the following reports can be used when insight into operational performance is needed each Wednesday?

Options:

A.

Static report

B.

Tactical report

C.

Recurring report

D.

Ad hoc report

Question 102

Mario works with a group of R programmers tasked with copying data from an accounting system into a data warehouse.

In what phase are the group's R skills most relevant?

Options:

A.

Extract.

B.

Load.

C.

Transform.

D.

Purge.

Question 103

Angela is aggregating data from a CRM system with data from an employee system.

While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.

What kind of issues is Angela facing?

Choose the best answer.

Options:

A.

ETL process.

B.

Record linkage.

C.

ELT process.

D.

System integration.

Question 104

Which of the following is a process that is used during data integration to collect, blend, and load data?

Options:

A.

MDM

B.

ETL

C.

OLTP

D.

BI

Question 105

A data analyst was asked to create a visual representation of sales for the first quarter of 2020. Which of the following visualizations should be used when a time element is present?

Options:

A.

A bubble chart

B.

A line chart

C.

A scatter plot

D.

An infographic

Question 106

An analyst runs a report on a daily basis, and the number of datapoints must be validated before the data can be analyzed. The number of datapoints increases each day by approximately 20% of the total number from the day before. On a given day, the number of datapoints was 8,798. Which of the following should be the total number of datapoints on the next day?

Options:

A.

7,038

B.

9,600

C.

10,600

D.

10,800
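The arithmetic in this question can be worked through in a few lines. The sketch below applies a 20% increase to the stated count and picks the listed option nearest to the result. The question says "approximately", so an exact match is not expected.

```python
previous_day = 8798   # datapoints on the given day (from the question)
growth_rate = 0.20    # approximate daily increase (from the question)

# Next-day estimate: previous total plus 20% of it.
estimate = previous_day * (1 + growth_rate)

# Answer options as listed; choose the one closest to the estimate.
options = [7038, 9600, 10600, 10800]
closest_option = min(options, key=lambda option: abs(option - estimate))
```

Here 8,798 × 1.20 = 10,557.6, so the estimate lands between 10,000 and 11,000 and is nearest to 10,600.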

Question 107

An analyst is updating a customer contacts database with information obtained from a survey of new customers. Which of the following data manipulation techniques should the analyst use?

Options:

A.

Join

B.

Append

C.

Transform

D.

Blend

Question 108

Which of the following data manipulation techniques is an example of a logical function?

Options:

A.

WHERE

B.

AGGREGATE

C.

BOOLEAN

D.

IF

Question 109

Which of the following best describes the law of large numbers?

Options:

A.

As a sample size decreases, its standard deviation gets closer to the average of the whole population.

B.

As a sample size grows, its mean gets closer to the average of the whole population

C.

As a sample size decreases, its mean gets closer to the average of the whole population.

D.

When a sample size doubles, the sample is indicative of the whole population.
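The law of large numbers can be explored with a quick simulation. The sketch below rolls a fair six-sided die (population mean 3.5) and compares the mean of a small sample with the mean of a large one. The seed, the sample sizes, and the helper name `sample_mean` are arbitrary choices for this illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable
POPULATION_MEAN = 3.5  # expected value of a fair six-sided die


def sample_mean(n):
    """Mean of n simulated die rolls."""
    return statistics.mean(random.randint(1, 6) for _ in range(n))


small_sample_mean = sample_mean(10)
large_sample_mean = sample_mean(100_000)

# Distance of the large-sample mean from the population mean.
large_sample_error = abs(large_sample_mean - POPULATION_MEAN)
```

A small sample can land anywhere between 1 and 6, while the large-sample mean is expected to sit very close to 3.5.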

Question 110

An analyst is working with the income data of suburban families in the United States. The data set has a lot of outliers, and the analyst needs to provide a measure that represents the typical income. Which of the following would BEST fulfill the analyst’s goal?

Options:

A.

Median

B.

Mean

C.

Mode

D.

Standard deviation
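To show how outliers affect different measures of central tendency, the sketch below compares the mean and median of a hypothetical income list containing one extreme value. The figures are made up for illustration.

```python
import statistics

# Hypothetical household incomes; the last value is an extreme outlier.
incomes = [48_000, 52_000, 55_000, 61_000, 2_500_000]

# The mean is pulled far upward by the single large value.
mean_income = statistics.mean(incomes)

# The median is the middle value after sorting and barely notices the outlier.
median_income = statistics.median(incomes)
```

With this list the mean is 543,200 while the median is 55,000.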

Question 111

What would be an example of an acceptable form of primary identification for the Data+ exam?

Options:

A.

Passport.

B.

School ID card.

C.

Employee ID card.

D.

Credit card with photo and signature.

Question 112

An analyst conducted a preliminary analysis for a data set and identified several patterns and anomalies. Which of the following analysis techniques did the analyst use?

Options:

A.

Performance analysis

B.

Exploratory analysis

C.

Link analysis

D.

Trend analysis

Question 113

Which of the following best describes a business analytics tool with interactive visualization and business capabilities and an interface that is simple enough for end users to create their own reports and dashboards?

Options:

A.

Python

B.

R

C.

Microsoft Power BI

D.

SAS

Question 114

A salesperson who is prospecting potential clients collected the following data:

Which of the following is an issue with this data?

Options:

A.

Duplicate data

B.

Invalid data

C.

Missing value

D.

Redundant data

Question 115

Which one of the following values will appear first if they are sorted in descending order?

Options:

A.

Aaron.

B.

Molly.

C.

Xavier.

D.

Adam.
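Descending text order is easy to verify. The sketch below sorts the four names from the question with Python's built-in `sorted`, which compares strings character by character.

```python
names = ["Aaron", "Molly", "Xavier", "Adam"]

# reverse=True sorts from the end of the alphabet toward the beginning.
descending = sorted(names, reverse=True)
first_in_descending = descending[0]
```

"Adam" sorts ahead of "Aaron" in descending order because the second letters differ and "d" comes after "a".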

Question 116

A data analyst needs to present the results of an online marketing campaign to the marketing manager. The manager wants to see the most important KPIs and measure the return on marketing investment. Which of the following should the data analyst use to BEST communicate this information to the manager?

Options:

A.

A real-time monitor that allows the manager to view performance the day the campaign was launched

B.

A self-service dashboard that allows the manager to look at the company's annual budget performance

C.

A spreadsheet of the raw data from all marketing campaigns and channels

D.

A summary with statistics, conclusions, and recommendations from the data analyst

Question 117

An analyst collected data that includes primary account numbers, expiration dates, and service codes. Which of the following data governance classifications is used to describe this data?

Options:

A.

PII

B.

PCI

C.

PBI

D.

PHI

Question 118

A reporting analyst needs to create a report that refreshes automatically and is accessible to the entire sales organization. Which of the following tools is the most appropriate to use for this task?

Options:

A.

R

B.

Excel

C.

Tableau

D.

Python
