
Chapter 17: Basic Statistical Models

Linear Regression

For a bivariate data set $ (x_1, y_1), (x_2, y_2), \dots, (x_n, y_n) $:

Assume that $ x_1, x_2, \dots, x_n $ are not random

$ y_1, y_2, \dots, y_n $ are realizations of random variables $ Y_1, Y_2, \dots, Y_n $ that satisfy

$ Y_i = \alpha + \beta x_i + U_i $

for $ i = 1, \dots, n $,

where the $ U_i $ are independent random variables with $ E(U_i) = 0 $ (the random fluctuations about the regression line are expected to average out to zero) and $ Var(U_i) = \sigma^2 $ (every point has the same variance, since each random fluctuation is assumed to have the same amount of variability)

The expectation of each $ Y_i $ is different, since it depends on $ x_i $:

$ E[Y_i] = E[\alpha + \beta x_i + U_i] = \alpha + \beta x_i + E[U_i] = \alpha + \beta x_i $
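A minimal simulation sketch of this model (all parameter values here are made up for illustration): generate realizations of $ Y_i = \alpha + \beta x_i + U_i $, taking the $ U_i $ to be normal for concreteness (the model itself only requires $ E(U_i) = 0 $ and common variance $ \sigma^2 $), then recover $ \alpha $ and $ \beta $ by least squares.

```python
import numpy as np

# Hypothetical parameters for the model Y_i = alpha + beta * x_i + U_i
alpha, beta, sigma = 2.0, 0.5, 1.0
n = 100

rng = np.random.default_rng(42)
x = np.linspace(0, 10, n)                      # fixed, nonrandom x_i
u = rng.normal(loc=0.0, scale=sigma, size=n)   # U_i: E(U_i) = 0, Var(U_i) = sigma^2
y = alpha + beta * x + u                       # realizations of Y_i

# Least-squares estimates of the slope and intercept
beta_hat, alpha_hat = np.polyfit(x, y, deg=1)
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```

With $ n = 100 $ points the estimates should land close to the true $ \alpha = 2 $ and $ \beta = 0.5 $.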

Multiple Linear Regression

If we suspected that the data were better described by a function like

$ y = \alpha + \beta x + \gamma x^2 $

then it is no longer simple linear regression; it is multiple linear regression. The model is still linear in the parameters $ \alpha $, $ \beta $, $ \gamma $: treating $ x $ and $ x^2 $ as two separate explanatory variables gives a linear model with two regressors.
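A minimal sketch of how such a fit could be done (parameter values made up): since the model is linear in $ \alpha, \beta, \gamma $, ordinary least squares on a design matrix with columns $ 1, x, x^2 $ still applies.

```python
import numpy as np

# Hypothetical quadratic data: y = alpha + beta*x + gamma*x^2 + noise
alpha, beta, gamma, sigma = 1.0, -2.0, 0.3, 0.5
n = 100

rng = np.random.default_rng(0)
x = np.linspace(0, 10, n)
y = alpha + beta * x + gamma * x**2 + rng.normal(0, sigma, n)

# Design matrix with columns [1, x, x^2]: the model is linear
# in the parameters, so ordinary least squares still works.
X = np.column_stack([np.ones(n), x, x**2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"alpha_hat={coef[0]:.3f}, beta_hat={coef[1]:.3f}, gamma_hat={coef[2]:.3f}")
```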

Chapter 20: Efficiency and Mean Squared Error

Mean Squared Error (MSE)

Discussing unbiased estimators. The mean squared error of an estimator $ T $ for a parameter $ \theta $ is

$ MSE(T) = E\left[ (T - \theta)^2 \right] = Var(T) + \left( E[T] - \theta \right)^2 $

Comparison of two unbiased estimators:

1. Variance (spread): less spread means a better estimator

2. For an unbiased estimator the bias term $ E[T] - \theta $ vanishes, so $ MSE(T) = Var(T) $: the lower the spread, the lower the MSE, the better the estimator (see the simulation sketch below)
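A small simulation sketch illustrating the comparison, using a standard textbook setup (a $ U(0, \theta) $ sample; the specific values are made up): both $ T_1 = 2 \bar{X}_n $ and $ T_2 = \frac{n+1}{n} \max_i X_i $ are unbiased for $ \theta $, so comparing them comes down to comparing variances.

```python
import numpy as np

# Compare two unbiased estimators of theta for a Uniform(0, theta) sample:
#   T1 = 2 * sample mean          (unbiased since E[X] = theta / 2)
#   T2 = (n+1)/n * sample maximum (unbiased since E[max] = n/(n+1) * theta)
theta, n, trials = 5.0, 20, 100_000

rng = np.random.default_rng(1)
samples = rng.uniform(0, theta, size=(trials, n))

t1 = 2 * samples.mean(axis=1)
t2 = (n + 1) / n * samples.max(axis=1)

for name, t in [("T1", t1), ("T2", t2)]:
    bias = t.mean() - theta
    mse = np.mean((t - theta) ** 2)
    print(f"{name}: mean={t.mean():.4f}  bias={bias:+.4f}  "
          f"var={t.var():.4f}  MSE={mse:.4f}")
```

Both empirical biases should be near zero, while $ T_2 $ shows a much smaller variance and hence a much smaller MSE, making it the better of the two unbiased estimators.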