Exploratory Data Analysis (EDA) — Quiz | Data Analytics using Python

Question 1

Which method gives a fast statistical summary of all numeric columns?

A df.info()

B df.describe()

C df.head()

D df.columns
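
To try the four options yourself, here is a minimal sketch on a small made-up DataFrame. The column names and values are assumptions for illustration only and do not come from the course data.

```python
import pandas as pd

# Hypothetical toy data; not from the course material.
df = pd.DataFrame({
    "segment": ["Consumer", "Corporate", "Consumer", "Home Office"],
    "sales": [120.0, 340.0, 95.0, 410.0],
    "profit": [30.0, -12.0, 18.0, 55.0],
})

df.info()                        # prints dtypes and non-null counts
summary = df.describe()          # count, mean, std, min, quartiles, max
first_rows = df.head(2)          # first two rows of the table
column_names = list(df.columns)  # column labels only

print(summary)
print(first_rows)
print(column_names)
```

Running each call side by side shows which one returns statistics and which ones only describe the table's structure or contents.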

Question 2

The mean is much larger than the median and the skew is positive. This means…

A The data is symmetric

B A few large values pull the average up (right-skew)

C There are no outliers

D The median is wrong
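
One way to see how the mean, median, and skew behave together is to compute them on a small series. The sketch below uses hypothetical values with a single very large entry; the numbers are an assumption made for illustration.

```python
import pandas as pd

# Hypothetical values: mostly small, with one very large entry.
values = pd.Series([10, 12, 11, 13, 12, 11, 200])

mean_value = values.mean()      # sensitive to the large entry
median_value = values.median()  # middle value of the sorted data
skewness = values.skew()        # sign indicates the direction of the long tail

print(f"mean={mean_value:.2f}, median={median_value}, skew={skewness:.2f}")
```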

Question 3

In skewed data, which is the more honest 'typical' value?

A the mean

B the median

C the maximum

D the standard deviation

Question 4

A boxplot's box spans which range?

A min to max

B Q1 to Q3 (the middle 50%)

C mean ± 1 std

D 0 to the median

Question 5

The IQR rule flags an outlier as any value…

A above the mean

B below Q1 − 1.5×IQR or above Q3 + 1.5×IQR

C equal to the median

D among the 10 largest values
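
The IQR rule can be sketched in a few lines of pandas. The data is hypothetical, and the quartiles use pandas' default linear interpolation, which is an assumption about how the course computes them.

```python
import pandas as pd

# Hypothetical values with one suspiciously large entry.
values = pd.Series([10, 12, 11, 13, 12, 11, 200])

q1 = values.quantile(0.25)  # first quartile
q3 = values.quantile(0.75)  # third quartile
iqr = q3 - q1               # interquartile range (width of the middle 50%)

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Keep only the values that fall outside the two fences.
outliers = values[(values < lower_fence) | (values > upper_fence)]
print(outliers.tolist())
```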

Question 6

The Z-score method is best suited to data that is…

A heavily skewed

B roughly normal (bell-shaped)

C all text

D missing
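
A minimal sketch of the Z-score method follows. The data is made up, and the cutoff of 3 standard deviations is a common convention assumed here, not a value stated in the quiz.

```python
import pandas as pd

# Hypothetical values clustered around 50, plus one extreme entry.
values = pd.Series([48, 52, 50, 49, 51, 50, 47, 53, 50, 49,
                    51, 52, 48, 50, 50, 49, 51, 50, 52, 48, 120])

# Z-score: distance from the mean in units of the (sample) standard deviation.
z_scores = (values - values.mean()) / values.std()

threshold = 3  # assumed cutoff; adjust to your own convention
flagged = values[z_scores.abs() > threshold]
print(flagged.tolist())
```

Because the mean and standard deviation are themselves pulled by extreme values, it is worth checking the shape of the distribution before relying on these scores.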

Question 7

What should you do with a detected outlier?

A Always delete it

B Investigate it before deciding

C Ignore the dataset

D Replace it with the mean automatically

Question 8

A correlation of −0.55 between discount and profit means…

A they are unrelated

B higher discount tends to come with lower profit

C discount causes profit

D profit causes discount
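
To explore what a negative correlation looks like in practice, the sketch below computes a Pearson correlation on made-up discount and profit figures. The column names follow the question; the values are assumptions.

```python
import pandas as pd

# Hypothetical data in which profit tends to fall as discount rises.
df = pd.DataFrame({
    "discount": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    "profit": [50, 42, 45, 20, 25, 5],
})

# Pearson correlation between two columns (a number between -1 and 1).
r = df["discount"].corr(df["profit"])

# Full pairwise correlation matrix for all numeric columns.
corr_matrix = df.corr()

print(round(r, 2))
print(corr_matrix)
```

The coefficient describes how the two columns move together in this sample; on its own it says nothing about which one, if either, drives the other.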

Question 9

Correlation does NOT imply…

A a relationship

B causation

C a number between −1 and 1

D anything measurable

Question 10

Which expression compares a numeric metric across categories?

A df.corr()

B df.groupby('segment')['profit'].mean()

C df.describe()

D df.head()
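
The group-and-aggregate pattern named in the options can be tried on a toy table. The `segment` and `profit` column names come from the question; the rows are hypothetical.

```python
import pandas as pd

# Hypothetical orders with a category column and a numeric metric.
df = pd.DataFrame({
    "segment": ["Consumer", "Corporate", "Consumer", "Corporate", "Home Office"],
    "profit": [30.0, 10.0, 20.0, 50.0, 15.0],
})

# Split rows by segment, then average the profit within each group.
mean_profit = df.groupby("segment")["profit"].mean()
print(mean_profit)
```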

Question 11

What does ydata-profiling do?

A Cleans data automatically

B Generates a full automated EDA report

C Builds machine-learning models

D Connects to databases

Question 12

The best structure for communicating an EDA finding is…

A just show the chart

B Question → Finding → Evidence → Recommendation

C list every statistic

D a single number