## Summary

This page covers the definitions of mean and variance for continuous random variables (with a prescribed probability density function) and discrete random variables (with a prescribed probability mass function).

## Mean

### Continuous Random Variable

If we have a continuous random variable ${\displaystyle X}$ with a probability density function ${\displaystyle f(x)}$, the mean is given by:

${\displaystyle \mu =E[X]=\int xf(x)dx}$

(where the integral is over the range of x values)
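As a rough numerical illustration, the integral for the mean can be approximated with a simple midpoint rule. The density ${\displaystyle f(x)=2x}$ on ${\displaystyle [0,1]}$ and the helper names `integrate`, `density`, and `continuous_mean` are assumptions chosen for this sketch, not part of the definitions above.

```python
def integrate(g, a, b, steps=100_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


def density(x):
    # Hypothetical example density f(x) = 2x on [0, 1]; it integrates to 1.
    return 2.0 * x


def continuous_mean(f, a, b):
    """Mean E[X] = integral of x * f(x) over the support [a, b]."""
    return integrate(lambda x: x * f(x), a, b)


mean = continuous_mean(density, 0.0, 1.0)
print(mean)  # close to the exact value 2/3
```

For this density the exact mean is ${\displaystyle \int _{0}^{1}2x^{2}dx=2/3}$, so the printed value should agree with it to several decimal places.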

### Discrete Random Variable

The mean of a discrete random variable ${\displaystyle X}$ with discrete values ${\displaystyle x_{i},1\leq i\leq n}$ and a probability mass function ${\displaystyle p_{i}}$ is given by the expression:

${\displaystyle \mu =E[X]=\sum _{i=1}^{n}p_{i}x_{i}}$

Note that by definition, the probability mass function must sum to 1:

${\displaystyle \sum _{i=1}^{n}p_{i}=1}$

If we assume a uniform probability for each value, then the probability mass function of component i is just:

${\displaystyle p_{i}={\frac {1}{n}}}$
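A minimal sketch of the discrete mean, assuming the values and probabilities are given as two lists of equal length. The function name `discrete_mean` and the sample numbers are illustrative only.

```python
def discrete_mean(values, probs):
    """Mean of a discrete random variable: sum of p_i * x_i."""
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("the probability mass function must sum to 1")
    return sum(p * x for p, x in zip(probs, values))


values = [1, 2, 3, 4]

# Non-uniform probability mass function (illustrative numbers).
weighted = discrete_mean(values, [0.1, 0.2, 0.3, 0.4])

# Uniform probability mass function: p_i = 1/n for every value.
n = len(values)
uniform = discrete_mean(values, [1.0 / n] * n)

print(weighted, uniform)  # approximately 3.0 and 2.5
```

With a uniform probability mass function the result reduces to the ordinary arithmetic average of the values.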

## Variance

### Continuous Random Variable

The variance of a continuous random variable is given by:

${\displaystyle \sigma ^{2}=Var(X)=\int (x-\mu )^{2}f(x)dx}$

Note that this can be expanded and simplified,

${\displaystyle Var(X)=\int (x^{2}-2x\mu +\mu ^{2})f(x)dx=\int x^{2}f(x)dx-2\mu \int xf(x)dx+\mu ^{2}\int f(x)dx=\int x^{2}f(x)dx-2\mu ^{2}+\mu ^{2}=\int x^{2}f(x)dx-\mu ^{2}}$

or just

${\displaystyle \sigma ^{2}=Var(X)=\int x^{2}f(x)dx-\mu ^{2}}$

which is equivalent to saying:

${\displaystyle Var(X)=E[X^{2}]-E[X]^{2}}$
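The equivalence of the two forms can be checked numerically. The sketch below again assumes the example density ${\displaystyle f(x)=2x}$ on ${\displaystyle [0,1]}$ and a midpoint-rule helper `integrate`; both are assumptions made for illustration.

```python
def integrate(g, a, b, steps=100_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


def density(x):
    # Hypothetical example density f(x) = 2x on [0, 1].
    return 2.0 * x


a, b = 0.0, 1.0
mu = integrate(lambda x: x * density(x), a, b)

# Definition: integral of (x - mu)^2 f(x) dx.
var_definition = integrate(lambda x: (x - mu) ** 2 * density(x), a, b)

# Shortcut: E[X^2] - E[X]^2.
second_moment = integrate(lambda x: x * x * density(x), a, b)
var_shortcut = second_moment - mu ** 2

print(var_definition, var_shortcut)  # both close to 1/18
```

For this density ${\displaystyle E[X^{2}]=1/2}$ and ${\displaystyle E[X]=2/3}$, so both expressions should come out near ${\displaystyle 1/2-4/9=1/18}$.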

### Discrete Random Variable

If we have a discrete random variable with a probability mass function, the variance is given by:

${\displaystyle \sigma ^{2}=Var(X)=\sum _{i=1}^{n}p_{i}(x_{i}-\mu )^{2}}$

This can be simplified to:

${\displaystyle Var(X)=\sum _{i=1}^{n}p_{i}x_{i}^{2}-\left(\sum _{i=1}^{n}p_{i}x_{i}\right)^{2}}$

where the last term is equal to the square of the mean,

${\displaystyle Var(X)=\sum _{i=1}^{n}p_{i}x_{i}^{2}-\mu ^{2}}$
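A short sketch comparing the definition of the discrete variance with the simplified form (mean of the squares minus the square of the mean). The function names and sample numbers are illustrative assumptions.

```python
def discrete_variance(values, probs):
    """Variance from the definition: sum of p_i * (x_i - mu)^2."""
    mu = sum(p * x for p, x in zip(probs, values))
    return sum(p * (x - mu) ** 2 for p, x in zip(probs, values))


def discrete_variance_shortcut(values, probs):
    """Variance from the simplified form: sum of p_i * x_i^2, minus mu^2."""
    mu = sum(p * x for p, x in zip(probs, values))
    second_moment = sum(p * x * x for p, x in zip(probs, values))
    return second_moment - mu ** 2


values = [1, 2, 3, 4]
probs = [0.1, 0.2, 0.3, 0.4]

by_definition = discrete_variance(values, probs)
by_shortcut = discrete_variance_shortcut(values, probs)
print(by_definition, by_shortcut)  # both approximately 1.0
```

Here the mean is 3 and the mean of the squares is 10, so both forms give a variance of about ${\displaystyle 10-3^{2}=1}$.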

## Updating On The Fly

### Updating Discrete Mean

Let's suppose we have n data values, ${\displaystyle x_{i},1\leq i\leq n}$, and we are adding one additional data value ${\displaystyle x_{n+1}}$ and want to compute the effect it has on the mean. For the sake of simplicity we assume that the probability mass function is uniform, so that ${\displaystyle p_{i}={\frac {1}{n}}}$ before the new value is added and ${\displaystyle p_{i}={\frac {1}{n+1}}}$ afterwards. Then the old mean is given by:

${\displaystyle \mu _{old}={\frac {1}{n}}\sum _{i=1}^{n}x_{i}}$

and the new mean is given by:

${\displaystyle \mu _{new}={\frac {1}{n+1}}\sum _{i=1}^{n+1}x_{i}}$

We can re-express the new mean in terms of the old mean by the following relationship:

${\displaystyle \mu _{new}=\left({\frac {n}{n+1}}\right)\mu _{old}+\left({\frac {1}{n+1}}\right)x_{n+1}}$
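The relationship follows from splitting the new sum into the old sum, which equals ${\displaystyle n\mu _{old}}$, plus the new value. A minimal sketch, with `update_mean` as a hypothetical helper name:

```python
def update_mean(old_mean, n, new_value):
    """Return the mean of n + 1 values, given the mean of the first n."""
    return (n / (n + 1)) * old_mean + new_value / (n + 1)


data = [2.0, 4.0, 6.0]
old_mean = sum(data) / len(data)  # 4.0

new_value = 8.0
new_mean = update_mean(old_mean, len(data), new_value)

# Cross-check against recomputing the mean from scratch.
direct = sum(data + [new_value]) / (len(data) + 1)
print(new_mean, direct)  # both approximately 5.0
```

Only the old mean and the count n need to be stored; the individual data values are not required for the update.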

### Updating Discrete Variance

Suppose the same situation: we have n discrete data values ${\displaystyle x_{i},1\leq i\leq n}$, and we are adding one additional data value ${\displaystyle x_{n+1}}$ and want to compute the effect it has on the variance. As with updating the mean, we presume a uniform probability mass function for the sake of simplicity, ${\displaystyle p_{i}={\frac {1}{n}}}$ before the new value is added and ${\displaystyle p_{i}={\frac {1}{n+1}}}$ afterwards. Then the old variance is given by:

${\displaystyle Var_{old}(X)=\sum _{i=1}^{n}\left({\frac {x_{i}^{2}}{n}}\right)-\left(\sum _{i=1}^{n}{\frac {x_{i}}{n}}\right)^{2}}$

or,

${\displaystyle Var_{old}(X)={\frac {1}{n}}\sum _{i=1}^{n}x_{i}^{2}-\mu _{old}^{2}}$

(as before, the last term is the square of the old mean: the factor 1/n sits inside the sum that is squared, so ${\displaystyle \left(\sum _{i=1}^{n}{\frac {x_{i}}{n}}\right)^{2}=\mu _{old}^{2}}$). Writing the sum of squares in the first term as ${\displaystyle SS_{old}=\sum _{i=1}^{n}x_{i}^{2}}$ for convenience,

${\displaystyle Var_{old}(X)={\frac {1}{n}}SS_{old}-\mu _{old}^{2}}$

Now we can write out the new expression for the variance as:

${\displaystyle Var_{new}(X)={\frac {1}{n+1}}\sum _{i=1}^{n+1}\left(x_{i}^{2}\right)-\mu _{new}^{2}}$

which can be written in terms of the old sum of squares term as:

${\displaystyle Var_{new}(X)={\frac {1}{n+1}}\left(SS_{old}+x_{n+1}^{2}\right)-\mu _{new}^{2}}$

Thus, if we want to start with a set of n points, and quickly evaluate the effect that adding one additional point would have (say, from a pool of possible "next choices"), we can implement the above formulas for a one-shot update.
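A sketch of such a one-shot update, assuming we keep three running quantities: the count n, the mean, and the sum of squares SS. The function name `update_stats`, the sample data, and the candidate pool are hypothetical. The result is cross-checked against `statistics.pvariance` from the Python standard library, which computes the population variance.

```python
import statistics


def update_stats(n, mean, sum_sq, new_value):
    """Add one value to running statistics under a uniform mass function.

    Returns the new count, mean, sum of squares, and population variance.
    """
    new_n = n + 1
    new_mean = (n * mean + new_value) / new_n
    new_sum_sq = sum_sq + new_value ** 2
    new_var = new_sum_sq / new_n - new_mean ** 2
    return new_n, new_mean, new_sum_sq, new_var


data = [2.0, 4.0, 6.0]
n = len(data)
mean = sum(data) / n
sum_sq = sum(x * x for x in data)

# Effect of adding a single new point.
_, mean_after, _, var_after = update_stats(n, mean, sum_sq, 8.0)
var_direct = statistics.pvariance(data + [8.0])
print(mean_after, var_after, var_direct)

# Evaluate a pool of possible next choices without touching the stored data.
candidates = [0.0, 5.0, 20.0]
best = min(candidates, key=lambda c: update_stats(n, mean, sum_sq, c)[3])
print(best)  # the candidate that yields the smallest new variance
```

Each candidate costs a constant amount of work, independent of n. Note that subtracting the squared mean from the mean of squares can lose floating-point precision when the mean is large relative to the spread of the data.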