Exploratory Data Analysis (EDA) — Quiz

Answer all 12 questions, then submit. You need 70% to pass.

Question 1
Which method gives a fast statistical summary of all numeric columns?
A df.info()
B df.describe()
C df.head()
D df.columns
Question 2
The mean is much larger than the median and the skewness is positive. This means…
A The data is symmetric
B A few large values pull the average up (right-skew)
C There are no outliers
D The median is wrong
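To experiment with the quantities named in this question, the mean, median, and skewness can be computed on a small sample. The sketch below is illustrative only: the values are made up, and the skewness is the simple population formula, which can differ in magnitude (not in sign) from the adjusted estimate that pandas' `Series.skew()` reports.

```python
import statistics

# Made-up values: mostly small numbers plus two very large ones.
values = [10, 12, 11, 13, 12, 14, 11, 200, 350]

mean = statistics.mean(values)
median = statistics.median(values)

# Population skewness: average cubed deviation divided by the cubed
# population standard deviation. This is an assumption of the sketch,
# not necessarily the exact estimator a given library uses.
pstdev = statistics.pstdev(values)
skew = sum((v - mean) ** 3 for v in values) / (len(values) * pstdev ** 3)

print(f"mean={mean:.2f} median={median} skew={skew:.2f}")
```

Changing or removing the two large values and re-running shows how each statistic responds.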
Question 3
In skewed data, which is the more honest 'typical' value?
A the mean
B the median
C the maximum
D the standard deviation
Question 4
A boxplot's box spans which range?
A min to max
B Q1 to Q3 (the middle 50%)
C mean ± 1 std
D 0 to the median
Question 5
The IQR rule flags an outlier as any value…
A above the mean
B below Q1 − 1.5×IQR or above Q3 + 1.5×IQR
C equal to the median
D in the top 10
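The quartiles and IQR that appear in the options above can be computed directly. This is a minimal sketch on made-up values, using the standard library's `statistics.quantiles` with `method="inclusive"`; other quantile conventions can give slightly different Q1 and Q3.

```python
import statistics

# Made-up sample with one value far from the rest.
values = [4, 5, 5, 6, 7, 7, 8, 9, 40]

# Quartiles by linear interpolation over the observed data.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Fences at 1.5 x IQR beyond each quartile.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Values falling outside the fences.
flagged = [v for v in values if v < lower or v > upper]
print(q1, q3, iqr, lower, upper, flagged)
```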
Question 6
The Z-score method is best suited to data that is…
A heavily skewed
B roughly normal (bell-shaped)
C all text
D missing
Question 7
What should you do with a detected outlier?
A Always delete it
B Investigate it before deciding
C Ignore the dataset
D Replace it with the mean automatically
Question 8
A correlation of −0.55 between discount and profit means…
A they are unrelated
B higher discount tends to come with lower profit
C discount causes profit
D profit causes discount
Question 9
Correlation does NOT imply…
A a relationship
B causation
C a number between −1 and 1
D anything measurable
Question 10
Which compares a numeric metric across categories?
A df.corr()
B df.groupby('segment')['profit'].mean()
C df.describe()
D df.head()
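The calls listed in the options can be tried on a small DataFrame. The sketch below assumes pandas is installed; the `segment`, `discount`, and `profit` columns and all values are hypothetical, chosen only to match the names used in this quiz.

```python
import pandas as pd

# Hypothetical data: one categorical column and two numeric columns.
df = pd.DataFrame(
    {
        "segment": ["Consumer", "Consumer", "Corporate", "Corporate"],
        "discount": [0.0, 0.2, 0.1, 0.3],
        "profit": [10.0, 30.0, 40.0, 60.0],
    }
)

first_rows = df.head()                      # first rows of the table
summary = df.describe()                     # summary statistics per numeric column
pairwise = df[["discount", "profit"]].corr()  # correlation between numeric columns
by_segment = df.groupby("segment")["profit"].mean()  # one mean per segment

print(by_segment)
```

Printing each result side by side makes the difference between the four calls easy to see.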
Question 11
What does ydata-profiling do?
A Cleans data automatically
B Generates a full automated EDA report
C Builds machine-learning models
D Connects to databases
Question 12
The best structure for communicating an EDA finding is…
A just show the chart
B Question → Finding → Evidence → Recommendation
C list every statistic
D a single number