From charlesreid1

Chapter 17: Basic Statistical Models

Linear Regression

For a bivariate data set Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle (x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)} :

Assume that are not random

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle y_1, y_2, \dots, y_n} are realizations of random variables Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle Y_1, Y_2, \dots, Y_n} that satisfy

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle Y_i = \alpha + \beta x_i + U_i }

for Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle i=1 \dots n}

where Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle U_i} are independent random variables with Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle E(U_i) = 0} (because random fluctuations, expected to be zero about the regression line) and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle Var(U_i) = \sigma^2} (each point has same variance, because assuming each random fluctuation has same amount of variability)

Expectation of each Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle Y_i} is different:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle E[Y_i] = E[\alpha + \beta x_i + U_i] = \alpha + \beta x_i + E[U_i] = \alpha + \beta x_i }

Multiple Linear Regression

If we considered that the data were better matched by a function like

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle y = \alpha + \beta x + \gamma x^2 }

then it's no longer linear regression, it's multiple linear regression

Chapter 20

Mean Squared Error (MSE)

Discussing unbiased estimators

Comparison of two unbiased estimators:

1. Variance (spread): less spread means better estimator

2. The lower the spread, the lower the MSE, the better the estimator