Mean deviation uses absolute values, which are awkward to handle algebraically. Statisticians usually square the deviations instead: squaring also removes signs, but it produces a smooth, differentiable measure that can be minimised and combined using ordinary calculus and algebra. This leads to the two most widely used measures of spread.
Variance $\sigma^2$ is the mean of the squared deviations from the mean. For raw (ungrouped) data of $n$ values:
$$\sigma^2=\dfrac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2$$
Standard deviation $\sigma$ is the positive square root of the variance. The square root brings the measure back to the same units as the original data, which is why it is the quantity actually quoted:
$$\sigma=\sqrt{\dfrac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2}$$
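As a rough sketch of the two definitions in code, assuming plain Python and a small made-up data list (the names `population_variance` and `population_std` are illustrative, not taken from any library):

```python
import math

def population_variance(values):
    """Mean of the squared deviations from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def population_std(values):
    """Positive square root of the population variance."""
    return math.sqrt(population_variance(values))

data = [3, 7, 7, 19]              # made-up illustrative data, mean 9
print(population_variance(data))  # 36.0
print(population_std(data))       # 6.0
```

The deviations here are $-6, -2, -2, 10$, whose squares sum to $144$; dividing by $n=4$ gives $36$, and the square root brings the result back to the units of the data.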
For a frequency distribution (values $x_i$ for a discrete distribution, or class marks $x_i$ for a continuous one, with frequencies $f_i$ and $N=\sum f_i$):
$$\sigma=\sqrt{\dfrac{1}{N}\sum f_i(x_i-\bar{x})^2}$$
Expanding the square gives an equivalent computational form that avoids first finding $\bar{x}$ and subtracting it from every value:
$$\sigma^2=\dfrac{1}{N}\sum f_i x_i^2-\left(\dfrac{\sum f_i x_i}{N}\right)^2$$
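For readers who want the intermediate steps, the expansion can be written out as follows, using only $N=\sum f_i$ and $\bar{x}=\dfrac{1}{N}\sum f_i x_i$ from the definitions above:

$$\begin{aligned}
\sigma^2 &= \dfrac{1}{N}\sum f_i(x_i-\bar{x})^2\\
&= \dfrac{1}{N}\sum f_i x_i^2-2\bar{x}\cdot\dfrac{1}{N}\sum f_i x_i+\bar{x}^2\cdot\dfrac{1}{N}\sum f_i\\
&= \dfrac{1}{N}\sum f_i x_i^2-2\bar{x}^2+\bar{x}^2\\
&= \dfrac{1}{N}\sum f_i x_i^2-\left(\dfrac{\sum f_i x_i}{N}\right)^2
\end{aligned}$$

In words: the variance is the mean of the squares minus the square of the mean.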
When the values are large, the step-deviation (short-cut) method rescales the data first. Choose an assumed mean $A$ and a scale factor $h$ (normally the common class width), and set $y_i=\dfrac{x_i-A}{h}$. Then:
$$\sigma=h\sqrt{\dfrac{1}{N}\sum f_i y_i^2-\left(\dfrac{\sum f_i y_i}{N}\right)^2}$$
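The factor $h$ outside the root can be justified in a few lines. Writing $\bar{y}=\dfrac{1}{N}\sum f_i y_i$ (a symbol introduced here only for convenience), the substitution $x_i=A+hy_i$ gives:

$$\begin{aligned}
\bar{x} &= A+h\bar{y}\\
x_i-\bar{x} &= h(y_i-\bar{y})\\
\sigma^2 &= \dfrac{1}{N}\sum f_i\,h^2(y_i-\bar{y})^2 = h^2\left[\dfrac{1}{N}\sum f_i y_i^2-\bar{y}^2\right]
\end{aligned}$$

The assumed mean $A$ cancels completely, so any convenient choice of $A$ gives the same $\sigma$; only the scale $h$ survives.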
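The table below contrasts the data types and the form you would use:

| Data type | Form to use |
|---|---|
| Raw (ungrouped) data | $\sigma=\sqrt{\dfrac{1}{n}\sum (x_i-\bar{x})^2}$ |
| Discrete frequency distribution | $\sigma=\sqrt{\dfrac{1}{N}\sum f_i(x_i-\bar{x})^2}$ |
| Continuous frequency distribution | Same form, with $x_i$ taken as the class marks |
| Large values with a common class width | $\sigma=h\sqrt{\dfrac{1}{N}\sum f_i y_i^2-\left(\dfrac{\sum f_i y_i}{N}\right)^2}$, where $y_i=\dfrac{x_i-A}{h}$ |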
Deeper Insight: why we square instead of taking absolute values. Squaring deviations does the same sign-removing job as the absolute value, but it brings three advantages that make variance central to statistics. First, the function $\sum(x_i-a)^2$ is smooth everywhere (it has a derivative at every point), whereas $\sum|x_i-a|$ has sharp corners at each $a=x_i$; setting the derivative $-2\sum(x_i-a)$ to zero shows that the minimising value is exactly the mean $\bar{x}$. Second, squaring gives more weight to large deviations: an observation twice as far from the mean contributes four times as much, so the measure is more sensitive to outliers than the mean deviation. Third, variances add for independent quantities, a property that mean absolute deviations do not have in general; that additivity is what makes the standard deviation the natural scale for the normal distribution, error analysis and the inferential statistics you will meet later. Taking the square root at the end is not cosmetic: it restores the original units (rupees, kilograms, marks), so a standard deviation can be read on the same axis as the data itself.
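The additivity claim can be checked exactly on a small case. The sketch below is only an illustration under stated assumptions: it uses two fair six-sided dice as the independent quantities (an example chosen here, not taken from the text), and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def mean(values):
    return Fraction(sum(values), len(values))

def variance(values):
    """Population variance, computed exactly with fractions."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def mean_abs_dev(values):
    """Mean absolute deviation about the mean."""
    m = mean(values)
    return sum(abs(v - m) for v in values) / len(values)

die = [1, 2, 3, 4, 5, 6]
# All 36 equally likely totals of two independent dice.
sums = [a + b for a, b in product(die, die)]

print(variance(die), variance(sums))          # 35/12 35/6
print(mean_abs_dev(die), mean_abs_dev(sums))  # 3/2 35/18
```

The variance of the total is exactly twice the variance of one die, whereas the mean absolute deviation of the total ($35/18$) is not twice that of one die ($3/2$).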
Find the variance and standard deviation of: $2, 4, 6, 8, 10$.
Solution:
- $n=5$. Mean $\bar{x}=\dfrac{2+4+6+8+10}{5}=\dfrac{30}{5}=6$.
- Deviations $(x_i-6)$: $-4, -2, 0, 2, 4$. Squares: $16, 4, 0, 4, 16$.
- $\sum(x_i-\bar{x})^2 = 16+4+0+4+16=40$.
- Variance $\sigma^2=\dfrac{40}{5}=8$.
- Standard deviation $\sigma=\sqrt{8}=2\sqrt{2}\approx 2.83$.
Answer: $\sigma^2=8$, $\sigma=2\sqrt{2}\approx 2.83$.
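This result can be cross-checked with Python's standard `statistics` module. One caution, shown in the sketch: `pvariance` and `pstdev` divide by $n$ as in this section, while `variance` divides by $n-1$ (the sample variance) and so gives a different number.

```python
import statistics

data = [2, 4, 6, 8, 10]

pop_var = statistics.pvariance(data)    # divides by n, as in this section
pop_sd = statistics.pstdev(data)        # square root of pop_var
sample_var = statistics.variance(data)  # divides by n - 1, NOT the formula used here

print(pop_var, round(pop_sd, 2), sample_var)
```

Here `pop_var` equals $8$ and `pop_sd` rounds to $2.83$, in agreement with the hand calculation, while `sample_var` equals $40/4=10$.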
Find the standard deviation of the first $n$ natural numbers, then evaluate it for $n=10$.
Solution:
- For the values $1, 2, \dots, n$, use $\sum x_i^2=\dfrac{n(n+1)(2n+1)}{6}$ and $\bar{x}=\dfrac{n+1}{2}$.
- $\sigma^2=\dfrac{1}{n}\sum x_i^2-\bar{x}^2=\dfrac{(n+1)(2n+1)}{6}-\dfrac{(n+1)^2}{4}=\dfrac{n^2-1}{12}$.
- So $\sigma=\sqrt{\dfrac{n^2-1}{12}}$.
- For $n=10$: $\sigma^2=\dfrac{100-1}{12}=\dfrac{99}{12}=8.25$, so $\sigma=\sqrt{8.25}\approx 2.87$.
Answer: $\sigma=\sqrt{\dfrac{n^2-1}{12}}$; for $n=10$, $\sigma\approx 2.87$.
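A quick way to gain confidence in the closed form is to compare it with the definition for many values of $n$. The sketch below does this with exact fractions; the function name `variance_first_n` is made up for this illustration.

```python
from fractions import Fraction

def variance_first_n(n):
    """Population variance of 1, 2, ..., n straight from the definition."""
    values = range(1, n + 1)
    mean = Fraction(sum(values), n)
    return sum((x - mean) ** 2 for x in values) / n

# Compare with the closed form (n^2 - 1) / 12 for n = 1 .. 50.
for n in range(1, 51):
    assert variance_first_n(n) == Fraction(n * n - 1, 12)

print(variance_first_n(10))  # 33/4
```

The value $33/4=8.25$ for $n=10$ matches the working above.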
Find the variance of the discrete distribution: $x_i = 4, 8, 11, 17, 20$ with frequencies $f_i = 3, 5, 9, 5, 3$.
Solution:
- $N=\sum f_i=3+5+9+5+3=25$.
- $\sum f_i x_i = 4(3)+8(5)+11(9)+17(5)+20(3)=12+40+99+85+60=296$, so $\bar{x}=\dfrac{296}{25}=11.84$.
- $\sum f_i x_i^2 = 16(3)+64(5)+121(9)+289(5)+400(3)=48+320+1089+1445+1200=4102$.
- $\sigma^2=\dfrac{\sum f_i x_i^2}{N}-\bar{x}^2=\dfrac{4102}{25}-(11.84)^2=164.08-140.1856=23.8944$.
Answer: Variance $\sigma^2\approx 23.89$ (and $\sigma\approx 4.89$).
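The arithmetic above can be reproduced without rounding by working in exact fractions. This is a minimal sketch; both function names are hypothetical, and the second function is included only to confirm that the definition and the computational form agree.

```python
from fractions import Fraction

def grouped_variance(xs, fs):
    """Computational form: mean of squares minus square of the mean."""
    N = sum(fs)
    sum_fx = sum(f * x for x, f in zip(xs, fs))
    sum_fx2 = sum(f * x * x for x, f in zip(xs, fs))
    return Fraction(sum_fx2, N) - Fraction(sum_fx, N) ** 2

def grouped_variance_by_definition(xs, fs):
    """Definition: weighted mean of squared deviations from the mean."""
    N = sum(fs)
    mean = Fraction(sum(f * x for x, f in zip(xs, fs)), N)
    return sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / N

xs = [4, 8, 11, 17, 20]
fs = [3, 5, 9, 5, 3]

var = grouped_variance(xs, fs)
print(var, float(var))  # 14934/625 23.8944
```

Both functions return the same fraction, $14934/625$, which is the $23.8944$ obtained by hand.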
Find the standard deviation for the continuous distribution: classes $0\text{-}10, 10\text{-}20, 20\text{-}30, 30\text{-}40, 40\text{-}50$ with frequencies $5, 8, 15, 16, 6$.
Solution:
- Class marks $x_i$: $5, 15, 25, 35, 45$; $N=50$.
- $\sum f_i x_i = 25+120+375+560+270=1350$, so $\bar{x}=\dfrac{1350}{50}=27$.
- $\sum f_i x_i^2 = 25(5)+225(8)+625(15)+1225(16)+2025(6)=125+1800+9375+19600+12150=43050$.
- $\sigma^2=\dfrac{43050}{50}-27^2=861-729=132$.
- $\sigma=\sqrt{132}\approx 11.49$.
Answer: $\sigma^2=132$, $\sigma=\sqrt{132}\approx 11.49$.
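The same calculation can be sketched in code starting from the class limits rather than the class marks. The function name `grouped_std_from_classes` and the `(lower, upper)` tuple layout are choices made for this illustration. As in the hand method, each class is represented by its midpoint, so the result is an approximation to the spread of the underlying raw data.

```python
import math

def grouped_std_from_classes(classes, freqs):
    """Return (mean, variance, std) using class midpoints as class marks."""
    marks = [(lower + upper) / 2 for lower, upper in classes]
    n_total = sum(freqs)
    mean = sum(f * m for m, f in zip(marks, freqs)) / n_total
    var = sum(f * (m - mean) ** 2 for m, f in zip(marks, freqs)) / n_total
    return mean, var, math.sqrt(var)

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 8, 15, 16, 6]

mean, var, sd = grouped_std_from_classes(classes, freqs)
print(mean, var, round(sd, 2))  # 27.0 132.0 11.49
```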
Using the step-deviation method, find the standard deviation: classes $0\text{-}10, 10\text{-}20, 20\text{-}30, 30\text{-}40, 40\text{-}50$ with frequencies $5, 8, 15, 16, 6$. (Take $A=25$, $h=10$.)
Solution:
- Class marks $x_i$: $5, 15, 25, 35, 45$. With $A=25$, $h=10$, $y_i=\dfrac{x_i-25}{10}$ gives $-2, -1, 0, 1, 2$; $N=50$.
- $\sum f_i y_i = (-2)(5)+(-1)(8)+0(15)+1(16)+2(6)=-10-8+0+16+12=10$.
- $\sum f_i y_i^2 = 4(5)+1(8)+0(15)+1(16)+4(6)=20+8+0+16+24=68$.
- $\sigma=h\sqrt{\dfrac{\sum f_i y_i^2}{N}-\left(\dfrac{\sum f_i y_i}{N}\right)^2}=10\sqrt{\dfrac{68}{50}-\left(\dfrac{10}{50}\right)^2}$.
- $=10\sqrt{1.36-0.04}=10\sqrt{1.32}\approx 10(1.1489)\approx 11.49$.
Answer: $\sigma\approx 11.49$, matching the direct method applied to the same data in the previous example.
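A short sketch can confirm that the step-deviation result does not depend on the assumed mean. The function name `step_deviation_std` is hypothetical, and the alternative values $A=15$ and $A=35$ are tried here purely as an experiment; only $A=25$ comes from the problem.

```python
import math

def step_deviation_std(marks, freqs, A, h):
    """Standard deviation via y = (x - A) / h, then scaled back by h."""
    N = sum(freqs)
    ys = [(x - A) / h for x in marks]
    mean_y = sum(f * y for y, f in zip(ys, freqs)) / N
    mean_y2 = sum(f * y * y for y, f in zip(ys, freqs)) / N
    return h * math.sqrt(mean_y2 - mean_y ** 2)

marks = [5, 15, 25, 35, 45]
freqs = [5, 8, 15, 16, 6]

# Every choice of assumed mean should give the same sigma (about 11.49).
for A in (25, 15, 35):
    print(A, round(step_deviation_std(marks, freqs, A, 10), 2))
```

Changing $A$ shifts every $y_i$ by the same amount, which changes $\sum f_i y_i$ and $\sum f_i y_i^2$ but leaves their combination under the root unchanged.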
The mean of $5$ observations is $4.4$ and their variance is $8.24$. If three of the observations are $1, 2, 6$, find the other two.
Solution:
- Let the unknowns be $a$ and $b$. Mean: $\dfrac{1+2+6+a+b}{5}=4.4\Rightarrow 9+a+b=22\Rightarrow a+b=13$.
- Variance: $\dfrac{1}{5}\sum x_i^2-\bar{x}^2=8.24\Rightarrow \dfrac{1}{5}\sum x_i^2=8.24+19.36=27.6$, so $\sum x_i^2=138$.
- $1^2+2^2+6^2+a^2+b^2=138\Rightarrow 41+a^2+b^2=138\Rightarrow a^2+b^2=97$.
- From $a+b=13$: $a^2+b^2=(a+b)^2-2ab=169-2ab=97\Rightarrow ab=36$.
- Solving $a+b=13$, $ab=36$ gives $a=4$, $b=9$ (roots of $t^2-13t+36=0$).
Answer: The other two observations are $4$ and $9$.
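The steps above follow a fixed recipe (find $a+b$, then $a^2+b^2$, then $ab$, then solve a quadratic), so they can be sketched as a small function. The name `missing_pair` is made up for this illustration, and the mean and variance are passed as exact fractions to avoid floating-point rounding of $4.4$ and $8.24$.

```python
import math
from fractions import Fraction

def missing_pair(known, n, mean, variance):
    """Recover two unknown observations from the mean and variance of all n."""
    s = n * mean - sum(known)                # a + b
    sum_sq = n * (variance + mean ** 2)      # sum of all x_i^2
    q = sum_sq - sum(x * x for x in known)   # a^2 + b^2
    p = (s * s - q) / 2                      # ab, from (a + b)^2 = a^2 + b^2 + 2ab
    disc = s * s - 4 * p                     # discriminant of t^2 - s t + p = 0
    if disc < 0:
        raise ValueError("no real pair of observations fits these values")
    root = math.sqrt(disc)
    return float(s - root) / 2, float(s + root) / 2

a, b = missing_pair([1, 2, 6], 5, Fraction("4.4"), Fraction("8.24"))
print(a, b)  # 4.0 9.0
```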