MeanAndVariance
From charlesreid1
Summary
This page covers mean and variance definitions for continuous random variables (with prescribed probability density function) and discrete random variables (with prescribed probability mass function).
Mean
Continuous Random Variables
If we have a continuous random variable $X$ with a probability density function $f(x)$, the mean is given by:
$$ \mu = E[X] = \int x f(x) \, dx $$
(where the integral is over the range of x values)
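As a rough numerical illustration of the integral above, the sketch below approximates the mean with a midpoint rule. The function names `continuous_mean` and `uniform_pdf`, the choice of a uniform density on [0, 2], and the step count are all assumptions made for this example and are not part of the original page.

```python
def continuous_mean(pdf, a, b, steps=10_000):
    """Approximate E[X] = integral of x * f(x) dx over [a, b] (midpoint rule)."""
    width = (b - a) / steps
    total = 0.0
    for k in range(steps):
        x = a + (k + 0.5) * width  # midpoint of the k-th subinterval
        total += x * pdf(x) * width
    return total


def uniform_pdf(x):
    """Density of a uniform random variable on [0, 2] (illustrative choice)."""
    return 0.5 if 0.0 <= x <= 2.0 else 0.0


mean = continuous_mean(uniform_pdf, 0.0, 2.0)
print(mean)  # close to 1.0, the midpoint of the interval
```

For densities with unbounded support, the limits `a` and `b` would have to be truncated to where the density is negligible, which is a further approximation.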
Discrete Random Variable
The mean of a discrete random variable with discrete values $x_i, 1 \leq i \leq n$, and a probability mass function $p_i$ is given by the expression:
$$ \mu = E[X] = \sum_{i=1}^{n} p_i x_i $$
Note that by definition, the probability mass function must sum to 1:
$$ \sum_{i=1}^{n} p_i = 1 $$
If we assume a uniform probability for each value, then the probability mass function of component $i$ is just:
$$ p_i = \frac{1}{n} $$
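A minimal sketch of the discrete mean, assuming the values and probabilities are given as two equal-length Python lists. The name `discrete_mean` and the fair six-sided die are illustrative choices, not taken from the original page.

```python
import math


def discrete_mean(values, probabilities):
    """Return sum(p_i * x_i) after checking that the probabilities sum to 1."""
    if len(values) != len(probabilities):
        raise ValueError("values and probabilities must have the same length")
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-12):
        raise ValueError("probabilities must sum to 1")
    return sum(p * x for x, p in zip(values, probabilities))


# Fair die: uniform probability mass function, p_i = 1/n
faces = [1, 2, 3, 4, 5, 6]
uniform = [1 / len(faces)] * len(faces)
print(discrete_mean(faces, uniform))  # approximately 3.5
```

With a uniform probability mass function the result reduces to the ordinary arithmetic average, `sum(faces) / len(faces)`.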
Variance
Continuous Random Variable
The variance of a continuous random variable is given by:
$$ \sigma^2 = \mathrm{Var}(X) = E[(X - \mu)^2] = \int (x - \mu)^2 f(x) \, dx $$
Note that this can be expanded and simplified, using $\int x f(x) \, dx = \mu$ and $\int f(x) \, dx = 1$:
$$ \mathrm{Var}(X) = \int (x^2 - 2 x \mu + \mu^2) f(x) \, dx = \int x^2 f(x) \, dx - 2 \mu \int x f(x) \, dx + \mu^2 \int f(x) \, dx = \int x^2 f(x) \, dx - 2 \mu^2 + \mu^2 = \int x^2 f(x) \, dx - \mu^2 $$
or just
$$ \sigma^2 = \mathrm{Var}(X) = \int x^2 f(x) \, dx - \mu^2 $$
which is equivalent to saying:
$$ \mathrm{Var}(X) = E[X^2] - (E[X])^2 $$
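The equivalence of the two forms can be checked numerically. The sketch below assumes the density $f(x) = 2x$ on $[0, 1]$ (an illustrative choice, not from the original page) and a simple midpoint-rule helper named `integrate`. For this density the exact values are $\mu = 2/3$ and $\mathrm{Var}(X) = 1/18$.

```python
def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    width = (b - a) / steps
    return sum(g(a + (k + 0.5) * width) for k in range(steps)) * width


def pdf(x):
    """Illustrative density f(x) = 2x on [0, 1]; it integrates to 1."""
    return 2.0 * x


mu = integrate(lambda x: x * pdf(x), 0.0, 1.0)

# Definition: integral of (x - mu)^2 f(x) dx
var_definition = integrate(lambda x: (x - mu) ** 2 * pdf(x), 0.0, 1.0)

# Shortcut: E[X^2] - (E[X])^2
var_shortcut = integrate(lambda x: x * x * pdf(x), 0.0, 1.0) - mu ** 2

print(mu, var_definition, var_shortcut)  # roughly 0.6667, 0.0556, 0.0556
```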
Discrete Random Variable
If we have a discrete random variable with a probability mass function, the variance is given by:
$$ \sigma^2 = \mathrm{Var}(X) = \sum_{i=1}^{n} p_i (x_i - \mu)^2 $$
This can be simplified to:
$$ \mathrm{Var}(X) = \sum_{i=1}^{n} p_i x_i^2 - \left( \sum_{i=1}^{n} p_i x_i \right)^2 $$
where the last term is equal to the square of the mean,
$$ \mathrm{Var}(X) = \sum_{i=1}^{n} p_i x_i^2 - \mu^2 $$
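A short sketch comparing the definition of the discrete variance with the simplified form, in which the subtracted term is the squared mean. The function name `discrete_variance` and the fair-die data are assumptions made for illustration.

```python
def discrete_variance(values, probabilities):
    """Return the variance computed two ways: by definition and by the shortcut."""
    mu = sum(p * x for x, p in zip(values, probabilities))
    # Definition: sum of p_i * (x_i - mu)^2
    by_definition = sum(p * (x - mu) ** 2 for x, p in zip(values, probabilities))
    # Shortcut: sum of p_i * x_i^2, minus the squared mean
    by_shortcut = sum(p * x * x for x, p in zip(values, probabilities)) - mu ** 2
    return by_definition, by_shortcut


faces = [1, 2, 3, 4, 5, 6]
uniform = [1 / 6] * 6
by_definition, by_shortcut = discrete_variance(faces, uniform)
print(by_definition, by_shortcut)  # both approximately 35/12 = 2.9167
```

The two results agree up to floating-point rounding. The shortcut form subtracts two potentially large numbers, so it can lose precision when the mean is large relative to the spread of the data.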
Updating On The Fly
Updating Discrete Mean
Let's suppose we have $n$ data values, $x_i, 1 \leq i \leq n$, and we are adding one additional data value $x_{n+1}$ and want to compute the effect it has on the mean. For the sake of simplicity we assume that the probability mass function is uniform, so that $p_i = \frac{1}{n}$ before the new value is added and $p_i = \frac{1}{n+1}$ after. Then the old mean is given by:
$$ \mu_{old} = \frac{1}{n} \sum_{i=1}^{n} x_i $$
and the new mean is given by:
$$ \mu_{new} = \frac{1}{n+1} \sum_{i=1}^{n+1} x_i $$
Since $\sum_{i=1}^{n} x_i = n \mu_{old}$, we can re-express the new mean in terms of the old mean by the following relationship:
$$ \mu_{new} = \left( \frac{n}{n+1} \right) \mu_{old} + \left( \frac{1}{n+1} \right) x_{n+1} $$
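The update relationship for the mean can be sketched in a few lines. The function name `update_mean` and the sample data are hypothetical, chosen only to show that the update avoids re-summing all of the old values.

```python
def update_mean(old_mean, n, new_value):
    """Mean of n + 1 values, given the mean of the first n and the new value."""
    return (n / (n + 1)) * old_mean + new_value / (n + 1)


data = [2.0, 4.0, 6.0]
old_mean = sum(data) / len(data)  # 4.0
new_mean = update_mean(old_mean, len(data), 10.0)
print(new_mean)  # 5.5, the same as the mean of [2.0, 4.0, 6.0, 10.0]
```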
Updating Discrete Variance
Suppose the same situation: we have $n$ discrete data values $x_i, 1 \leq i \leq n$, and we are adding one additional data value $x_{n+1}$ and want to compute the effect it has on the variance. As with updating the mean, we presume a uniform probability mass function for the sake of simplicity, $p_i = \frac{1}{n}$ before the new value is added and $p_i = \frac{1}{n+1}$ after. Then the old variance is given by:
$$ \mathrm{Var}_{old}(X) = \sum_{i=1}^{n} \left( \frac{x_i^2}{n} \right) - \left( \sum_{i=1}^{n} \frac{x_i}{n} \right)^2 $$
or,
$$ \mathrm{Var}_{old}(X) = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \mu_{old}^2 $$
(as before, the last term is the square of the old mean, because $\sum_{i=1}^{n} \frac{x_i}{n} = \mu_{old}$). Grouping the sum in the first term as $SS_{old} = \sum_{i=1}^{n} x_i^2$ (sum of squares) for convenience,
$$ \mathrm{Var}_{old}(X) = \frac{1}{n} SS_{old} - \mu_{old}^2 $$
Now we can write out the new expression for the variance as:
$$ \mathrm{Var}_{new}(X) = \frac{1}{n+1} \sum_{i=1}^{n+1} \left( x_i^2 \right) - \mu_{new}^2 $$
which can be written in terms of the old sum of squares term as:
$$ \mathrm{Var}_{new}(X) = \frac{1}{n+1} SS_{old} + \frac{1}{n+1} x_{n+1}^2 - \mu_{new}^2 $$
Thus, if we want to start with a set of n points, and quickly evaluate the effect that adding one additional point would have (say, from a pool of possible "next choices"), we can implement the above formulas for a one-shot update.
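A minimal sketch of such a one-shot update, assuming a uniform probability mass function (so this is the population variance, dividing by the number of points rather than by one less) and a running sum of squares equal to the plain sum of $x_i^2$. The function name `update_stats`, the sample data, and the candidate pool are hypothetical.

```python
def update_stats(n, mean, sum_squares, new_value):
    """Return (n, mean, sum of squares, variance) after adding one value."""
    new_n = n + 1
    new_mean = (n * mean + new_value) / new_n
    new_sum_squares = sum_squares + new_value ** 2
    # Var_new = SS_new / (n + 1) - mu_new^2
    new_variance = new_sum_squares / new_n - new_mean ** 2
    return new_n, new_mean, new_sum_squares, new_variance


data = [2.0, 4.0, 6.0]
n = len(data)
mean = sum(data) / n
sum_squares = sum(x * x for x in data)

# Evaluate each candidate "next choice" without touching the original data
candidates = [0.0, 4.0, 10.0]
for candidate in candidates:
    _, cand_mean, _, cand_variance = update_stats(n, mean, sum_squares, candidate)
    print(candidate, cand_mean, cand_variance)
```

Each candidate is evaluated in constant time from the three stored quantities. Because the formula subtracts the squared mean from the mean of squares, it can lose precision for data with a large mean and small spread; Welford's algorithm is a commonly used alternative in that case.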