
Free and Premium CompTIA DA0-001 Dumps Questions Answers

Page: 1 / 27
Total 363 questions

CompTIA Data+ Certification Exam Questions and Answers

Question 1

Which of the following is an example of a flat file?

Options:

A.

CSV file

B.

PDF file

C.

JSON file

D.

JPEG file

Question 2

Which of the following technologies would be best suited for creating a multiple linear regression model?

Options:

A.

Microsoft Power BI

B.

R

C.

SQL

D.

Tableau

Question 3

Which of the following is the best technique for transferring data from one database to another with some data manipulation?

Options:

A.

Application programming interfaces

B.

Delta load

C.

Extract, transform, load

D.

Export/import

Question 4

Q3 2020 has just ended, and now a data analyst needs to create an ad-hoc sales report that demonstrates how well the Q3 2020 promotion went versus last year's Q3 promotion.

Which of the following date parameters should the analyst use?

Options:

A.

2019 vs. YTD 2020

B.

Q3 2019 vs. Q3 2020

C.

YTD 2019 vs. YTD 2020

D.

Q4 2019 vs. Q3 2020

Question 5

A salesperson who is prospecting potential clients collected the following data:

Which of the following is an issue with this data?

Options:

A.

Duplicate data

B.

Invalid data

C.

Missing value

D.

Redundant data

Question 6

The director of operations at a power company needs data to help identify where company resources should be allocated in order to monitor activity for outages and restoration of power in the entire state. Specifically, the director wants to see the following:

* County outages

* Status

* Overall trend of outages

INSTRUCTIONS:

Please, select each visualization to fit the appropriate space on the dashboard and choose an appropriate color scheme. Once you have selected all visualizations, please, select the appropriate titles and labels, if applicable. Titles and labels may be used more than once.

If at any time you would like to bring back the initial state of the simulation, please click the Reset All button.

Options:

Question 7

Which of the following is the correct extension for a tab-delimited spreadsheet file?

Options:

A.

tap

B.

tar

C.

tsv

D.

az
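
As an illustration of what "tab-delimited" means in practice, the sketch below writes and re-reads a few rows with Python's standard `csv` module using a tab as the delimiter. The sample rows are hypothetical and an in-memory buffer stands in for a real file.

```python
import csv
import io

# Hypothetical sample rows; a real file would be opened with open(path, newline="").
rows = [["site", "sales"], ["North", "120"], ["South", "95"]]

# Write the rows with a tab between fields instead of a comma.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(rows)
tsv_text = buffer.getvalue()

# Read the same text back, again telling the reader that fields are tab-separated.
parsed = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
print(parsed)
```

The only difference from reading a comma-separated file is the `delimiter` argument.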

Question 8

A data set was recorded using multimedia technology. Which of the following is a necessary step on the way to interpretation?

Options:

A.

Structural equation modeling

B.

Transcription

C.

Sequential analysis

D.

Sampling

Question 9

A user imports a data file into the accounts payable system each day. On a regular basis, the field input is not what the system is expecting, so it results in an error for the row and a broken import process. To resolve the issue, the user opens the file, finds the error in the row, and manually corrects it before attempting the import again. The import sometimes breaks on subsequent attempts, though. Which of the following changes should be made to this process to reduce the number of errors?

Options:

A.

Delete all incorrect inputs and upload the corrected file.

B.

Have the user manually review the file for data completeness before loading it.

C.

Create a data field to data type validator to run the file through prior to import.

D.

Spot-check the file prior to import to catch and correct field errors.
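
A field-to-data-type validator of the kind mentioned in the options could look roughly like the sketch below. The schema, field names, and sample rows are all hypothetical; the idea is only that every value is checked against its expected type before the import runs, so all bad rows are reported at once.

```python
from datetime import datetime

# Hypothetical schema: field name -> converter that raises ValueError on bad input.
SCHEMA = {
    "invoice_id": int,
    "amount": float,
    "due_date": lambda s: datetime.strptime(s, "%Y-%m-%d"),
}

def validate_rows(rows):
    """Return (row_number, field, value) for every value that fails its type check."""
    errors = []
    for row_number, row in enumerate(rows, start=1):
        for field, convert in SCHEMA.items():
            value = row.get(field, "")
            try:
                convert(value)
            except (ValueError, TypeError):
                errors.append((row_number, field, value))
    return errors

# Hypothetical file contents, already parsed into dictionaries.
sample = [
    {"invoice_id": "1001", "amount": "250.00", "due_date": "2024-03-01"},
    {"invoice_id": "1002", "amount": "n/a", "due_date": "2024-03-15"},
    {"invoice_id": "10x3", "amount": "75.50", "due_date": "03/20/2024"},
]
errors = validate_rows(sample)
for error in errors:
    print(error)
```

Running the validator before each import reports every failing field in one pass, rather than one error per broken import attempt.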

Question 10

Which of the following is a KPI metric for tracking sales performance?

Options:

A.

Order status percentage

B.

Customer acquisition percentage

C.

Gross profit percentage

D.

Click-through rate percentage

Question 11

Given the following data:

Which of the following BEST describes the data set?

Options:

A.

There is data bias.

B.

The data is incomplete.

C.

The data is inconsistent.

D.

The data contains outliers.

Question 12

Which of the following values is the range (a measure of dispersion) of the scores of ten students in a test?

The scores of ten students in a test are 17, 23, 30, 36, 45, 51, 58, 66, 72, 77.

Options:

A.

90

B.

60

C.

70

D.

80
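
The range is the largest value minus the smallest. A quick check on the scores listed in the question:

```python
scores = [17, 23, 30, 36, 45, 51, 58, 66, 72, 77]

# Range = maximum - minimum
score_range = max(scores) - min(scores)
print(score_range)  # 60
```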

Question 13

Which of the following are reasons to conduct data cleansing? (Select two).

Options:

A.

To perform web scraping

B.

To track KPIs

C.

To improve accuracy

D.

To review data sets

E.

To increase the sample size

F.

To calculate trends

Question 14

Mario works with a group of R programmers tasked with copying data from an accounting system into a data warehouse.

In what phase are the group's R skills most relevant?

Options:

A.

Extract.

B.

Load.

C.

Transform.

D.

Purge.

Question 15

A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:

Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?

Options:

A.

$640,900

B.

$690,000

C.

$705,200

D.

$702,500

Question 16

Given the table below:

Which of the following variable types BEST describes the “Year” column?

Options:

A.

Numeric

B.

Date

C.

Alphanumeric

D.

Text

Question 17

An analyst needs to conduct a quick analysis. Which of the following is the FIRST step the analyst should perform with the data?

Options:

A.

Conduct an exploratory analysis and use descriptive statistics.

B.

Conduct a trend analysis and use a scatter chart.

C.

Conduct a link analysis and illustrate the connection points.

D.

Conduct an initial analysis and use a Pareto chart.

Question 18

Each month an analyst needs to execute a data pull for the two prior months. Which of the following is the most efficient function for the analyst to use?

Options:

A.

Logical

B.

Date

C.

Aggregate

D.

System
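
To show how a date function can define a rolling "two prior months" window without hard-coding dates, here is a minimal sketch. The function name and the half-open window convention (start inclusive, end exclusive) are assumptions made for illustration.

```python
from datetime import date

def prior_two_months_window(today):
    """Return (start, end): start is the first day of the month two months back,
    end is the first day of the current month (exclusive upper bound)."""
    end = date(today.year, today.month, 1)
    # Count months from year 0 so that stepping back across January is handled.
    month_index = today.year * 12 + (today.month - 1) - 2
    start = date(month_index // 12, month_index % 12 + 1, 1)
    return start, end

start, end = prior_two_months_window(date(2024, 1, 15))
print(start, end)  # 2023-11-01 2024-01-01
```

A monthly pull would then filter on `start <= record_date < end`, and the same code works unchanged every month.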

Question 19

An analyst is currently working on a ticket for revamping a company-wide dashboard that has been in use for five years. Which of the following should be the first step in the development process?

Options:

A.

Talk to the group that made the request to determine the desired goal.

B.

Make changes to a frequently used report that is already in production.

C.

Build an additional dashboard with fewer views that are tailored toward each specific team.

D.

Develop a more streamlined dashboard to roll out by the next delivery date.

Question 20

Which of the following is an example of PII?

Options:

A.

Age

B.

Name

C.

Ethnicity

D.

Gender

Question 21

An analyst develops an IT document and needs to describe the technical terms used in the document. Which of the following is where the analyst should include descriptions of the technical terms?

Options:

A.

Glossary

B.

System diagram

C.

User requirements

D.

Index

Question 22

Taylor wants to investigate how manufacturing, marketing, and sales expenditures impact overall profitability for her company.

Which of the following systems is the most appropriate?

Options:

A.

OLTP.

B.

OLAP.

C.

Data warehouse.

D.

Data mart.

Question 23

Which of the following is an object associated with a table that sorts and stores table row data in a key-value pair?

Options:

A.

Foreign key

B.

Function

C.

Stored procedure

D.

Clustered index

Question 24

Which of the following is used for calculations and pivot tables?

Options:

A.

IBM SPSS

B.

SAS

C.

Microsoft Excel

D.

Domo

Question 25

Which of the following variable name formats would be problematic if used in the majority of data software programs?

Options:

A.

First_Name_

B.

FirstName

C.

First_Name

D.

First Name

Question 26

Which of the following is the BEST reason to use database views instead of tables?

Options:

A.

Views reduce the need for repetitive, complex data joins.

B.

Views allow for the storage of temporary data, whereas tables do not.

C.

Views allow for the joining of multiple data sources, whereas tables do not.

D.

Views can be used to restrict sensitive information.

Question 27

A research analyst collects ten data points from 1,000 specimens. The analyst will not need any additional data to complete the analysis and will not need to retrieve information by specifier. Which of the following is the best data structure for the analyst to use?

Options:

A.

NoSQL

B.

Flat file

C.

JSON

D.

Relational database

Question 28

An analyst needs to summarize the number of people in Chicago in 2022 using the following set of data:

Which of the following steps should the analyst use to provide results? (Select two).

Options:

A.

Aggregation

B.

Sorting

C.

Filtering

D.

Indexing

E.

Cleaning

F.

Replacing

Question 29

The ACME Corporation hired an analyst to detect data quality issues in their Excel documents. Which of the following are the most common issues? (Select TWO)

Options:

A.

Apostrophe.

B.

Commas.

C.

Symbols.

D.

Duplicates.

E.

Misspellings.

Question 30

Which of the following reports can be used when insight into operational performance is needed each Wednesday?

Options:

A.

Static report

B.

Tactical report

C.

Recurring report

D.

Ad hoc report

Question 31

A data analyst needs to calculate the mean for Q1 sales using the data set below:

Which of the following is the mean?

Options:

A.

$2,466.18

B.

$2,667.60

C.

$3,082.72

D.

$12,330.88

Question 32

Given the following tables:

Which of the following will be the dimensions from a FULL JOIN of the tables above?

Options:

A.

Two rows and three columns

B.

Three rows and four columns

C.

Four rows and two columns

D.

Four rows and four columns

Question 33

A sales director has requested that a report on individual team members within the division be developed. The director would like the report to be shared with all team members, but individual team members should not be identifiable within the report. Which of the following access requirements would support the director's needs?

Options:

A.

Create an acceptable use policy for the sales data.

B.

Release the report as user-group-based access and include data masking.

C.

Get a data use agreement from the individual team members.

D.

Provide the report based on role and include data encryption.

Question 34

Which of the following is the most likely reason for a data analyst to optimize a query using parameterization?

Options:

A.

To return a subset of records

B.

To insert a temporary table

C.

To prevent SQL injections

D.

To increase the query speed
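
For readers unfamiliar with parameterization, the sketch below contrasts a parameterized query with one built by string concatenation, using Python's built-in `sqlite3` module. The table, rows, and input string are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "alice"), (2, "bob")])

user_input = "alice' OR '1'='1"  # a classic injection attempt

# The ? placeholder sends the value separately from the SQL text,
# so the input is compared as a literal string rather than executed.
safe_rows = conn.execute(
    "SELECT id FROM orders WHERE customer = ?", (user_input,)
).fetchall()

# Building the statement by concatenation lets the input rewrite the query.
unsafe_sql = "SELECT id FROM orders WHERE customer = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

print(safe_rows)    # no customer has that literal name
print(unsafe_rows)  # the injected OR clause matches every row
```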

Question 35

A data analyst has received a data set that contains actual and projected sales for the fourth quarter of 2019. Which of the following statistical methods should the analyst use to find the measure of dispersion?

Options:

A.

Mean

B.

Variance

C.

Correlation

D.

Confidence interval

Question 36

The number of phone calls that the call center receives in a day is an example of:

Options:

A.

continuous data.

B.

categorical data.

C.

ordinal data.

D.

discrete data.

Question 37

Which of the following is a domain-specific language used in programming that is designed for managing data that is held in a relational data stream management system?

Options:

A.

SAS

B.

SQL

C.

Python

D.

R

Question 38

Which of the following techniques is used to quantify data?

Options:

A.

Decoding

B.

Enumeration

C.

Coding

D.

Structure

Question 39

Which of the following is the best description of the term "data governance"?

Options:

A.

Data governance governs the development of a data visualization dashboard in an organization.

B.

Data governance is the policy that protects against data breaches by cybercriminals.

C.

Data governance is the process of analyzing, manipulating, and reporting data in an organization.

D.

Data governance is the availability, usability, integrity, and security of data in an enterprise.

Question 40

Which of the following is most likely to be used as a data-mining ETL tool?

Options:

A.

SSIS

B.

Stata

C.

SPSS

D.

Cognos

Question 41

Which of the following is the most appropriate to consider when creating a schema of a central group broken into detailed subcategories?

Options:

A.

Relational

B.

Hierarchical

C.

Snowflake

D.

Star

Question 42

Which of the following roles is responsible for ensuring an organization's data quality, security, privacy, and regulatory compliance?

Options:

A.

Data owner.

B.

Data steward.

C.

Data custodian.

D.

Data processor.

Question 43

Given the diagram below:

Which of the following steps is missing?

Options:

A.

Remove redundant data.

B.

Validate the data types.

C.

Connect to the data API.

D.

Normalize the data.

Question 44

An analyst is explaining the company’s financial systems and reporting tools to a new coworker. Which of the following data quality dimensions are the most important? (Select three).

Options:

A.

Data formatting

B.

Data accuracy

C.

Data maturity

D.

Data field

E.

Data completeness

F.

Data consistency

G.

Data diversity

Question 45

Given the following table:

Which of the following describes the data quality issues with the age data?

Options:

A.

Completeness

B.

Consistency

C.

Accuracy

D.

Manipulation

Question 46

An analyst is currently working on a ticket to revamp a company-wide dashboard that has been in use for five years. Which of the following should be the first step in the development process?

Options:

A.

Talk to the group that made the request to determine the desired goal.

B.

Make changes to a frequently used report that is already in production.

C.

Build an additional dashboard with fewer views tailored toward each specific team.

D.

Develop a more streamlined dashboard to roll out by the next delivery date.

Question 47

A company wants to know how its customers interact with an e-commerce website based on clicks over items. Which of the following is the primary requirement for this report?

Options:

A.

Data content

B.

Frequency

C.

Filtering

D.

Views

Question 48

Which of the following data manipulation techniques is an example of a logical function?

Options:

A.

WHERE

B.

AGGREGATE

C.

BOOLEAN

D.

IF

Question 49

An analyst needs to know what data an organization possesses. Which of the following is the best document for the analyst to consult?

Options:

A.

Data destruction policy

B.

Data use document

C.

Data dictionary

D.

Data retention policy

Question 50

A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered?

Options:

A.

Include a line chart using the site and average sales per customer.

B.

Include a pie chart using the site and sales to average sales per customer.

C.

Include a scatter chart using sales volume and average sales per customer.

D.

Include a column chart using the site and sales to average sales per customer.

Question 51

An analyst is designing a dashboard to determine which site has the highest percentage of new customers. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered to best display the data?

Options:

A.

Include a bar chart using the site and the percentage of new customers data.

B.

Include a line chart using the site and the percentage of new customers data.

C.

Include a pie chart using the site and percentage of new customers data.

D.

Include a scatter chart using the site and the percent of new customers data.

Question 52

Which of the following is the median of the number set: 3, 7, 5, 6, 9?

Options:

A.

5

B.

6

C.

7

D.

9
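
The median is the middle value once the numbers are sorted. A short check using Python's standard `statistics` module:

```python
import statistics

numbers = [3, 7, 5, 6, 9]

# Sorting makes the middle position visible: [3, 5, 6, 7, 9]
ordered = sorted(numbers)
median_value = statistics.median(numbers)
print(ordered, median_value)  # [3, 5, 6, 7, 9] 6
```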

Question 53

Given the table below:

Which of the following boxes indicates that a Type II error has occurred?

Options:

A.

1

B.

2

C.

3

D.

4

Question 54

Which role in data governance is typically responsible for day-to-day oversight of data use?

Options:

A.

Data processors.

B.

Data custodians.

C.

Data owners.

D.

Data stewards.

Question 55

An analyst has been tracking company intranet usage and has been asked to create a chart to show the most-used/most-clicked portions of a homepage that contains more than 30 links. Which of the following visualizations would BEST illustrate this information?

Options:

A.

Scatter plot

B.

Heat map

C.

Pie chart

D.

Infographic

Question 56

A JSON file is an example of:

Options:

A.

structured data.

B.

web data.

C.

machine data.

D.

processed data.

Question 57

A financial analyst is creating a daily billing report for a company. One night, the company's data warehouse did not update the data, which caused the data to be reported incorrectly the next day. Which of the following documentation elements should the analyst add to catch this error?

Options:

A.

Version number

B.

Data refresh

C.

Frequently asked questions tab

D.

Summary

Question 58

An analyst for a small business with multiple locations is using each location’s quarterly sales reports from last year to create a single revenue report for the year. Which of the following data mining techniques should the analyst use to complete this task?

Options:

A.

Data merge

B.

Data append

C.

Data blending

D.

Data imputation

Question 59

Which of the following occurs if a 90% confidence interval increases to 95%?

Options:

A.

The margin of error does not change.

B.

The interval remains the same.

C.

The interval becomes narrower.

D.

The margin of error doubles.

Question 60

A dataset requires an analysis for investigating and discovering abnormalities. Which of the following best describes the nature of the exploratory analysis conducted?

Options:

A.

Summary of the data's main characteristics

B.

Best data tuning method

C.

Set of methods for cleaning the data

D.

Method of checking the quality of the data

Question 61

A data analyst needs to create a weekly recurring report on sales performance and distribute it to all sales managers. Which of the following would be the BEST method to automate and ensure successful delivery for this task?

Options:

A.

Use scheduled report delivery.

B.

Implement subscription access delivery.

C.

Print out a copy.

D.

Upload the report to the server.

Question 62

A sales manager wants quarterly sales reports broken down by unit and week. Which of the following data output lists includes the most necessary information?

Options:

A.

Order number, salesperson, date shipped, recipient address, and price

B.

Item name, salesperson, recipient address, shipping cost, and date shipped

C.

Item number, item name, salesperson, date sold, and price

D.

Item name, salesperson, price, shipping cost, and date shipped

Question 63

An analyst is building a monthly report for production and wants to ensure the audience is aware of its once-a-month cadence. Which of the following is the MOST important to convey that information?

Options:

A.

The date of the dashboard build

B.

The data refresh date

C.

A report summary

D.

Frequently asked questions

Question 64

A client has requested an analysis of all pet care items purchased by current customers and their social media connections in the past 12 months. Which of the following data analysis techniques would be the best choice given these requirements?

Options:

A.

Trend analysis

B.

Performance analysis

C.

Link analysis

D.

Exploratory data analysis

Question 65

An analyst needs to join two data sets that compare vehicle weights. One data set is in pounds, and the other has various units of measure. Which of the following should the analyst do first to the data prior to any type of join?

Options:

A.

Blend

B.

Reduce

C.

Concatenate

D.

Normalize

Question 66

A data scientist wants to see which products make the most money and which products attract the most customer purchasing interest in their company.

Which of the following data manipulation techniques would he use to obtain this information?

Options:

A.

Data append

B.

Data blending

C.

Normalize data

D.

Data merge

Question 67

Joseph is interpreting a left-skewed distribution of test scores. Joe scored at the mean, Alfonso scored at the median, and Gaby scored at the end of the tail.

Who had the highest score?

Options:

A.

Joseph

B.

Joe

C.

Alfonso

D.

Gaby

Question 68

Which of the following describes the method of sampling in which elements of data are selected randomly from each of the small subgroups within a population?

Options:

A.

Simple random

B.

Cluster

C.

Systematic

D.

Stratified

Question 69

An analyst wants to combine two data sets into a single spreadsheet. Column names from the first spreadsheet are listed in rows in the second spreadsheet. Which of the following is the first step the analyst should take to combine the data sets?

Options:

A.

Blend

B.

Merge

C.

Concatenate

D.

Transpose
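
When one sheet lists field names across the top and the other lists them down the first column, swapping rows and columns aligns the two layouts. A minimal sketch with hypothetical data, using `zip` to transpose:

```python
# Hypothetical first sheet: column names run across the first row.
wide = [
    ["id", "name", "city"],
    [1, "Ana", "Lima"],
    [2, "Ben", "Oslo"],
]

# Hypothetical second sheet: the same names run down the first column.
tall = [
    ["id", 3, 4],
    ["name", "Cy", "Di"],
    ["city", "Rome", "Kyiv"],
]

# zip(*rows) turns rows into columns, i.e. transposes the table.
transposed = [list(row) for row in zip(*tall)]

# The headers now match, so the data rows can be stacked under the first sheet.
combined = wide + transposed[1:]
for row in combined:
    print(row)
```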

Question 70

An analyst runs a report on a daily basis, and the number of datapoints must be validated before the data can be analyzed. The number of datapoints increases each day by approximately 20% of the total number from the day before. On a given day, the number of datapoints was 8,798. Which of the following should be the total number of datapoints on the next day?

Options:

A.

7,038

B.

9,600

C.

10,600

D.

10,800
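
The expected count can be estimated by applying the stated 20% daily growth to the current figure and then picking the closest listed value, since the question says "approximately". A small sketch of that arithmetic:

```python
current = 8798
growth_rate = 0.20

# Next-day estimate: today's count plus 20% of today's count.
expected_next = current * (1 + growth_rate)

# The options as listed in the question.
options = [7038, 9600, 10600, 10800]
closest = min(options, key=lambda value: abs(value - expected_next))

print(round(expected_next, 1), closest)  # 10557.6 10600
```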

Question 71

Given the following data set:

Which of the following is the best reason for cleansing the data?

Options:

A.

Duplicate data

B.

Imputed data

C.

Redundant data

D.

Corrupt data

Question 72

A data analyst who works for a government agency is required to obtain the average income of citizens. The list of citizens is given in the following table:

A value for one citizen's income is missing. Which of the following approaches should the data analyst take to solve this issue?

Options:

A.

Replace the missing value with the average of the rest of the unemployed citizens.

B.

Insert the value 0 into the field with the missing value.

C.

Impute the mean of the other citizens' incomes into the field with the missing value.

D.

Exclude employed citizens from the analysis.

Question 73

A company's human resources department has asked a data analyst to categorize the income of all employees into five salary bands:

Which of the following types of functions would be the most appropriate to use?

Options:

A.

Statistical

B.

Aggregate

C.

Logical

D.

Mathematical

Question 74

A data analyst needs to collect a similar proportion of data from every state. Which of the following sampling methods would be the most appropriate?

Options:

A.

Systematic sampling

B.

Convenience sampling

C.

Stratified sampling

D.

Random sampling

Question 75

After completing web scraping, which of the following file formats needs to be parsed?

Options:

A.

.html

B.

.txt

C.

.csv

D.

.tsv

Question 76

Angela is aggregating data from a CRM system with data from an employee system.

While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.

What kind of issue is Angela facing?

Choose the best answer.

Options:

A.

ETL process.

B.

Record linkage.

C.

ELT process.

D.

System integration.

Question 77

A report is scheduled to run and be distributed at the end of business each day. On Mondays, one of the recipients opens the previous week's reports and combines them to calculate the weekly totals and projections for the coming week. This is a tedious process, and the recipient asks an analyst for help. Which of the following should the analyst recommend?

Options:

A.

Add calculation fields to the daily report so the totals are built in.

B.

Create a new report with weekly totals set to run at the end of business on Friday.

C.

Provide a daily summary to the report with totals to save the user the effort of manual calculations.

D.

Reduce the frequency of the report to once a week and change the date range.

Question 78

A data analyst received a large amount of third-party data that needs to be joined with in-house data files. After the data is joined, the analyst notices three columns all contain dates. Which of the following should the analyst do to maintain data consistency?

Options:

A.

Append all date columns and parse the strings.

B.

Impute all three date columns and then merge.

C.

Merge all date columns and unify the format.

D.

Separate the columns into a table and merge.

Question 79

Kelly wants to get feedback on the final draft of a strategic report that has taken her six months to develop.

What can she do to prevent confusion as she seeks feedback before publishing the report?

Choose the best answer.

Options:

A.

Distribute the report to the appropriate stakeholders via email.

B.

Use a watermark to identify the report as a draft.

C.

Show the report to her immediate supervisor.

D.

Publish the report on an internally facing website.

Question 80

Which of the following file formats is best suited to start exploratory analysis within statistical software?

Options:

A.

CSV

B.

XLSM

C.

XML

D.

JSON

Question 81

A data analyst has a set of data that shows the number of gallons of oil produced each day. The company would like to know the standard deviation for the data set. The variance for the data is 36 gallons. Which of the following is the standard deviation for gallons produced?

Options:

A.

1.16

B.

6

C.

36

D.

72
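
Standard deviation is the square root of the variance, so the value follows directly from the variance given in the question:

```python
import math

variance = 36

# Standard deviation = sqrt(variance)
std_dev = math.sqrt(variance)
print(std_dev)  # 6.0
```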

Question 82

Which of the following is an example of a flat file?

Options:

A.

CSV file

B.

PDF file

C.

JSON file

D.

JPEG file

Question 83

Given the following grocery store orders:

If a query is made to the table with the following logic:

Order_Total > 132 OR (Order_Total >= 25 AND Order_Total < 74)

Which of the following is the number of orders that will be returned by the query?

Options:

A.

Four

B.

Five

C.

Six

D.

Seven

Question 84

Randy scored 76 on a math test, Katie scored 86 on a science test, Ralph scored 80 on a history test, and Jean scored 80 on an English test. The table below contains the mean and standard deviation of the scores for each of the courses:

Using this information, which of the following students had the BEST score?

Options:

A.

Randy

B.

Katie

C.

Ralph

D.

Jean

Question 85

A data analyst received the information in the table below from a recently completed marketing campaign:

Which of the following is the total order conversion rate?

Options:

A.

13.2%

B.

14.8%

C.

22.3%

D.

85.2%

Question 86

Which of the following types of analysis is used when comparing last week's sales to the previous week's sales?

Options:

A.

Trend analysis

B.

Exploratory analysis

C.

Prescriptive analysis

D.

Link analysis

Question 87

Which of the following is concatenate typically used to combine?

Options:

A.

Rows

B.

Columns

C.

Tables

D.

Databases

Question 88

A customer list from a financial services company is shown below:

A data analyst wants to create a likely-to-buy score on a scale from 0 to 100, based on an average of the three numerical variables: number of credit cards, age, and income. Which of the following should the analyst do to the variables to ensure they all have the same weight in the score calculation?

Options:

A.

Recode the variables.

B.

Calculate the percentiles of the variables.

C.

Calculate the standard deviations of the variables.

D.

Normalize the variables.
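
Variables measured on very different scales (a handful of credit cards versus tens of thousands in income) contribute unequally to a plain average. One common way to put them on the same footing is min-max scaling to a 0 to 100 range. The sketch below uses hypothetical customer values, since the table is not reproduced here.

```python
def min_max_scale(values, new_min=0.0, new_max=100.0):
    """Rescale values linearly so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale, so return the lower bound.
        return [new_min for _ in values]
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]

# Hypothetical values for three customers.
cards = [1, 3, 5]
ages = [25, 40, 55]
incomes = [30000, 60000, 90000]

scaled = [min_max_scale(column) for column in (cards, ages, incomes)]

# Average the three scaled variables per customer to get a 0-100 score.
scores = [sum(values) / len(values) for values in zip(*scaled)]
print(scores)  # [0.0, 50.0, 100.0]
```

Without the scaling step, income would dominate the average simply because its numbers are larger.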

Question 89

A site reliability team wants to monitor the stability of their website so they can proactively diagnose issues when they occur. Which of the following deliverables would best suit their needs?

Options:

A.

A self-serve dashboard of website performance that updates in real time

B.

A weekly log report of site visits and user actions

C.

A portal that is refreshed daily and reports errors classified by type

D.

A daily summary email indicating website outages for the previous day

Question 90

What R package makes it easy to work with dates?

Options:

A.

Lubridate.

B.

Datemath.

C.

Stringr.

D.

ggplot.

Question 91

An analyst collected data that includes primary account numbers, expiration dates, and service codes. Which of the following data governance classifications is used to describe this data?

Options:

A.

PII

B.

PCI

C.

PBI

D.

PHI

Question 92

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D, over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Question 93

You are working with a dataset and want to change the names of categories that you used for different types of books.

What term best describes this action?

Options:

A.

Recoding.

B.

Summarizing

C.

Aggregating.

D.

Filtering.

Question 94

Which of the following statements would be used to append two tables that have the same number of columns?

Options:

A.

UNION ALL

B.

MERGE

C.

GROUP BY

D.

JOIN
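
To illustrate appending one table to another, the sketch below stacks two hypothetical tables with `UNION ALL` in an in-memory SQLite database and compares the result with plain `UNION`, which drops duplicate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_q1 (item TEXT, amount INTEGER);
CREATE TABLE sales_q2 (item TEXT, amount INTEGER);
INSERT INTO sales_q1 VALUES ('pen', 10), ('ink', 5);
INSERT INTO sales_q2 VALUES ('pen', 10), ('pad', 7);
""")

# UNION ALL keeps every row from both tables, including the repeated ('pen', 10).
appended = conn.execute(
    "SELECT item, amount FROM sales_q1 UNION ALL SELECT item, amount FROM sales_q2"
).fetchall()

# UNION removes duplicate rows from the combined result.
deduplicated = conn.execute(
    "SELECT item, amount FROM sales_q1 UNION SELECT item, amount FROM sales_q2"
).fetchall()

print(len(appended), len(deduplicated))  # 4 3
```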

Question 95

Which of the following is a difference between a primary key and a unique key?

Options:

A.

A unique key cannot take null values, whereas a primary key can take null values.

B.

There can be only one primary key in a data set, whereas there can be multiple unique keys.

C.

A primary key can take a value more than once, whereas a unique key cannot take a value more than once.

D.

A primary key cannot be a date variable, whereas a unique key can be.

Question 96

Given the image below:

The data should be cleaned because of the presence of:

Options:

A.

outlier

B.

non-parametric data.

C.

multicollinearity.

D.

invalid data.

Question 97

An analyst is designing a dashboard to determine which site has the highest percentage of new customers. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered to BEST display the data?

Options:

A.

Include a bar chart using the site and the percentage of new customers data.

B.

Include a line chart using the site and the percentage of new customers data.

C.

Include a pie chart using the site and percentage of new customers data.

D.

Include a scatter chart using the site and the percent of new customers data.

Question 98

Which of the following types of analyses should be used to evaluate the connections and anomalies in a data set when either known patterns are being violated or new patterns are emerging?

Options:

A.

Correlation

B.

Descriptive

C.

Graph

D.

Regression

Question 99

A data analyst for a media company needs to determine the most popular movie genre. Given the table below:

Which of the following must be done to the Genre column before this task can be completed?

Options:

A.

Append

B.

Merge

C.

Concatenate

D.

Delimit
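
The table is not reproduced here, so the sketch below assumes a Genre column in which some cells hold several comma-separated genres. Under that assumption, splitting each cell on the delimiter is what makes a per-genre count possible.

```python
from collections import Counter

# Hypothetical Genre column: some cells list more than one genre.
genre_column = ["Action, Comedy", "Drama", "Comedy, Drama", "Comedy"]

# Split each cell on the comma delimiter and trim surrounding spaces.
genre_counts = Counter(
    genre.strip() for cell in genre_column for genre in cell.split(",")
)

most_popular, count = genre_counts.most_common(1)[0]
print(most_popular, count)  # Comedy 3
```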

Question 100

Which of the following tools would be best to use to calculate the interquartile range, median, mean, and standard deviation of a column in a table that has 5,000,000 rows?

Options:

A.

Microsoft Excel

B.

R

C.

Snowflake

D.

SQL

Question 101

A publishing group has requested a dashboard to track submissions before publication. A key requirement is that all changes are tracked, as multiple users will be checking out documents and editing them before submissions are considered final. Which of the following is the BEST way to meet this stakeholder requirement?

Options:

A.

Display the version number next to each submission on the dashboard.

B.

Present a data refresh date at the top of the dashboard.

C.

Confirm the dashboard is adhering to the corporate style guide.

D.

Use permissions to ensure users only see certain versions of the submissions.

Question 102

Which of the following is the correct data type for text?

Options:

A.

Boolean

B.

String

C.

Integer

D.

Float

Question 103

Which of the following best describes how discrete data differs from continuous data?

Options:

A.

Discrete data cannot create a sloped line.

B.

Discrete data can only be a finite number of values.

C.

Discrete data can have decimal points.

D.

Discrete data applies only to numbers.

Question 104

Which of the following is a common data analytics tool that is also used as an interpreted, high-level, general-purpose programming language?

Options:

A.

SAS

B.

Microsoft Power BI

C.

IBM SPSS

D.

Python

Question 105

An analyst reviews the following data:

7

3

5

2

3

7

7

10

Which of the following is the value of the mode?

Options:

A.

3

B.

5

C.

7

D.

10
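
The mode is the value that occurs most often. Counting the values listed in the question:

```python
import statistics
from collections import Counter

data = [7, 3, 5, 2, 3, 7, 7, 10]

# Frequency of each value; 7 appears three times, 3 appears twice.
frequencies = Counter(data)
mode_value = statistics.mode(data)
print(frequencies[7], frequencies[3], mode_value)  # 3 2 7
```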

Question 106

A county in Illinois is conducting a survey to determine the mean annual income per household. The county is 427 sq mi (1,106 sq km). Which of the following sampling methods would MOST likely result in a representative sample?

Options:

A.

A stratified phone survey of 100 people that is conducted between 2:00 p.m. and 3:00 p.m.

B.

A systematic survey that is sent to 100 single-family homes in the county

C.

Surveys sent to ten randomly selected homes within 5 mi (8 km) of the county’s office

D.

Surveys sent to 100 randomly selected homes that are reflective of the population

Question 107

An analyst reviews the following table:

Which of the following data types is represented in the values in the RefNo column?

Options:

A.

Numeric

B.

Real Number

C.

Currency

D.

Alphanumeric

Question 108

A research analyst wants to determine whether the data being analyzed is connected to other datapoints. Which of the following is the BEST type of analysis to conduct?

Options:

A.

Trend analysis

B.

Performance analysis

C.

Link analysis

D.

Exploratory analysis
