Variance
1. What is Variance?
Variance is a statistical measure that tells you how much the values in a dataset differ from the mean (average) of that dataset. Variance shows how spread out the numbers are.
- If the variance is low, the numbers are close to the mean.
- If the variance is high, the numbers are spread out.
\[ \text{Variance} = \sigma^2 = \frac{1}{N} \sum_{i=1}^{N}(x_i - \mu)^2 \]
\[ \mu = \frac{1}{N} \sum_{i=1}^{N} x_i \]
- \(x_i\): each value in the dataset
- \(\mu\): population mean
- \(N\): total number of values in the population
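The population formula above can be sketched in a few lines of Python. The function name `population_variance` and the data values are illustrative choices, not something taken from the text.

```python
def population_variance(values):
    """Population variance: the mean of squared deviations from the mean."""
    n = len(values)
    mu = sum(values) / n                      # population mean
    return sum((x - mu) ** 2 for x in values) / n


data = [2, 4, 4, 4, 5, 5, 7, 9]               # illustrative values, mean is 5
print(population_variance(data))              # 4.0
```

Here the squared deviations sum to 32, and dividing by \(N = 8\) gives 4.0.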
2. What is Sample Variance?
Sample Variance measures how spread out the values are in a sample (a subset of the population). It estimates the true variance of the entire population using just the sample data.
\[ \text{Sample Variance} = s^2 = \frac{1}{n-1} \sum_{i=1}^{n}(x_i - \bar{x})^2 \]
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
- \(x_i\): each value in the sample
- \(\bar{x}\): sample mean
- \(n\): total number of values in the sample
We use \(n - 1\) instead of \(n\) (Bessel’s correction) to correct the bias that arises when estimating the population variance from a sample.
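As a minimal sketch, the sample variance can be computed by hand and compared against `statistics.variance` from the Python standard library, which also divides by \(n - 1\). The function name `sample_variance` and the data are illustrative assumptions.

```python
import math
import statistics


def sample_variance(values):
    """Sample variance with Bessel's correction (divide by n - 1)."""
    n = len(values)
    x_bar = sum(values) / n                   # sample mean
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


sample = [2, 4, 4, 4, 5, 5, 7, 9]             # illustrative values
s2 = sample_variance(sample)                  # 32 / 7, roughly 4.571
print(math.isclose(s2, statistics.variance(sample)))  # True
```

The same data gave 4.0 when divided by \(n\), so the correction makes the estimate slightly larger.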
3. Bessel’s correction
\[ x_i - \bar{x} = (x_i - \mu) - (\bar{x} - \mu) \]
\[ \Rightarrow (x_i - \bar{x})^2 = \left( (x_i - \mu) - (\bar{x} - \mu) \right)^2 \]
\[ \Rightarrow (x_i - \bar{x})^2 = (x_i - \mu)^2 - 2(x_i - \mu)(\bar{x} - \mu) + (\bar{x} - \mu)^2 \]
\[ \Rightarrow \sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}(x_i - \mu)^2 - 2(\bar{x} - \mu)\sum_{i=1}^{n}(x_i - \mu) + n(\bar{x} - \mu)^2 ~~~~~ (3.1) \]
Using the definition of the sample mean, we have:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n}x_i \]
\[ \Rightarrow \sum_{i=1}^{n}x_i = n\bar{x} \]
\[ \Rightarrow \sum_{i=1}^{n}(x_i - \mu) = \left(\sum_{i=1}^{n}x_i\right) - n\mu \]
\[ \Rightarrow \sum_{i=1}^{n}(x_i - \mu) = n\bar{x} - n\mu \]
\[ \Rightarrow \sum_{i=1}^{n}(x_i - \mu) = n(\bar{x} - \mu) \]
Substituting this into (3.1):
\[ \Rightarrow \sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}(x_i - \mu)^2 - 2n(\bar{x} - \mu)^2 + n(\bar{x} - \mu)^2 \]
\[ \Rightarrow \sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}(x_i - \mu)^2 - n(\bar{x} - \mu)^2 \]
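The identity just derived holds for any sample and any value of \(\mu\), so it can be checked numerically. The sketch below uses a hypothetical helper `check_identity` with illustrative numbers; \(\mu = 6\) is an arbitrary choice.

```python
import math


def check_identity(sample, mu):
    """Return both sides of
    sum((x - x_bar)^2) = sum((x - mu)^2) - n * (x_bar - mu)^2."""
    n = len(sample)
    x_bar = sum(sample) / n
    lhs = sum((x - x_bar) ** 2 for x in sample)
    rhs = sum((x - mu) ** 2 for x in sample) - n * (x_bar - mu) ** 2
    return lhs, rhs


lhs, rhs = check_identity([2, 4, 4, 4, 5, 5, 7, 9], mu=6.0)
print(lhs, rhs)                 # 32.0 32.0
print(math.isclose(lhs, rhs))   # True
```

The subtracted term \(n(\bar{x} - \mu)^2\) is never negative, which is why deviations measured from \(\bar{x}\) can only understate those measured from \(\mu\).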
By linearity of expectation, taking the expectation of both sides gives:
\[ \Rightarrow \mathbb{E}\left[\sum_{i=1}^{n}(x_i - \bar{x})^2\right] = \mathbb{E}\left[\sum_{i=1}^{n}(x_i - \mu)^2\right] - \mathbb{E}\left[n(\bar{x} - \mu)^2\right] ~~~~~ (3.2) \]
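From this post, we have the definitions of expected value and variance for a discrete random variable \(X\) (in these two formulas the sum runs over the possible values of \(X\)):
\[ \mathbb{E}[X] = \sum_{i=1}^{n} x_i \cdot P(X = x_i) ~~~~~ (3.3) \]
\[ \text{Var}(X) = \sum_{i=1}^{n} (x_i - \mathbb{E}[X])^2 \cdot P(X = x_i) ~~~~~ (3.4) \]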
From (3.3), we have:
\[ \Rightarrow \mathbb{E}[g(X)] = \sum_{i=1}^n g(x_i) \cdot P(X = x_i) \]
If we let \(g(X) = (X - \mu)^2\) with \(\mu = \mathbb{E}[X]\), then:
\[ \Rightarrow \mathbb{E}[(X - \mu)^2] = \sum_{i=1}^n (x_i - \mu)^2 \cdot P(X = x_i) \]
The right-hand side is exactly (3.4), so:
\[ \Rightarrow \text{Var}(X) = \mathbb{E}[(X - \mu)^2] \]
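To make the relation \(\text{Var}(X) = \mathbb{E}[(X - \mu)^2]\) concrete, here is a small sketch for a fair six-sided die. The die, the helper `expected_value`, and the dictionary representation of the distribution are assumptions chosen for illustration.

```python
from fractions import Fraction


def expected_value(pmf, g=lambda x: x):
    """E[g(X)] for a discrete distribution given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())


die = {face: Fraction(1, 6) for face in range(1, 7)}   # fair die

mu = expected_value(die)                               # E[X]
var = expected_value(die, lambda x: (x - mu) ** 2)     # E[(X - mu)^2]
print(mu, var)                                         # 7/2 35/12
```

Using `Fraction` keeps the arithmetic exact, so the result is the true variance rather than a rounded approximation.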
We now evaluate the two terms on the right-hand side of (3.2).
- First term:
\[ \mathbb{E}\left[\sum_{i=1}^{n}(x_i - \mu)^2\right] = \sum_{i=1}^{n} \mathbb{E}[(x_i - \mu)^2] \]
Each \(x_i\) is drawn from the same population as \(X\), so \(\mathbb{E}[(x_i - \mu)^2] = \text{Var}(X) = \sigma^2\):
\[ \Rightarrow \mathbb{E}\left[\sum_{i=1}^{n}(x_i - \mu)^2\right] = \sum_{i=1}^{n} \text{Var}(X) \]
\[ \Rightarrow \mathbb{E}\left[\sum_{i=1}^{n}(x_i - \mu)^2\right] = \sum_{i=1}^{n} \sigma^2 = n \cdot \sigma^2 \]
- Second term:
\[ \mathbb{E}\left[n(\bar{x} - \mu)^2\right] = n \cdot \mathbb{E}\left[(\bar{x} - \mu)^2\right] \]
Since \(\mathbb{E}[\bar{x}] = \mu\), the expectation on the right is the variance of \(\bar{x}\):
\[ \Rightarrow \mathbb{E}\left[n(\bar{x} - \mu)^2\right] = n \cdot \text{Var}(\bar{x}) \]
Applying the definition of the sample mean from Section 2, we have:
\[ \Rightarrow \mathbb{E}\left[n(\bar{x} - \mu)^2\right] = n \cdot \text{Var}\left(\frac{1}{n} \sum_{i=1}^{n} x_i\right) \]
By the variance property \( \text{Var}(aX) = a^2\text{Var}(X) \) for a constant \(a\):
\[ \Rightarrow \mathbb{E}\left[n(\bar{x} - \mu)^2\right] = n \cdot \frac{1}{n^2} \text{Var}\left(\sum_{i=1}^{n} x_i\right) \]
Assuming the \(x_i\) are drawn independently, the variance of the sum is the sum of the variances:
\[ \Rightarrow \mathbb{E}\left[n(\bar{x} - \mu)^2\right] = n \cdot \frac{1}{n^2} \sum_{i=1}^{n} \text{Var}(x_i) \]
\[ \Rightarrow \mathbb{E}\left[n(\bar{x} - \mu)^2\right] = n \cdot \frac{1}{n^2} \cdot n \cdot \sigma^2 = \sigma^2 \]
Now, (3.2) becomes:
\[ \Rightarrow \mathbb{E}\left[\sum_{i=1}^{n}(x_i - \bar{x})^2\right] = n \cdot \sigma^2 - \sigma^2 \]
\[ \Rightarrow \mathbb{E}\left[\sum_{i=1}^{n}(x_i - \bar{x})^2\right] = (n - 1) \cdot \sigma^2 \]
\[ \Rightarrow \mathbb{E}\left[\frac{1}{n - 1} \sum_{i=1}^{n}(x_i - \bar{x})^2\right] = \sigma^2 \]
So dividing by \(n - 1\) makes the sample variance \(s^2\) an unbiased estimator of the population variance \(\sigma^2\).
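The result can be checked exactly on a small case rather than by simulation. The sketch below assumes a fair six-sided die as the population (its variance is \(35/12\)) and enumerates every equally likely sample of size \(n = 3\) drawn with replacement. The helper name `average_estimate` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def average_estimate(faces, n, divisor):
    """Exact expectation of sum((x - x_bar)^2) / divisor over all equally
    likely samples of size n drawn with replacement from `faces`."""
    total = Fraction(0)
    count = 0
    for sample in product(faces, repeat=n):
        x_bar = Fraction(sum(sample), n)
        total += sum((x - x_bar) ** 2 for x in sample) / divisor
        count += 1
    return total / count


faces = range(1, 7)                          # fair die, variance 35/12
n = 3
print(average_estimate(faces, n, n - 1))     # 35/12 (unbiased)
print(average_estimate(faces, n, n))         # 35/18 (too small by (n - 1)/n)
```

Dividing by \(n - 1\) recovers \(\sigma^2 = 35/12\) exactly, while dividing by \(n\) gives \(\frac{n-1}{n}\sigma^2 = 35/18\), which is the bias that Bessel’s correction removes.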