Statistics

Suppose, for simplicity, that a population has a true parameter $\theta$, which is a scalar, and that we want to estimate it. If we observe $n$ samples from this population, then our estimate $\hat{\theta}_{n}$ is a function of these samples: $\hat{\theta}_{n} = \hat{f}(x_{1}, x_{2}, \dots, x_{n})$.
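To make "an estimate is a function of the samples" concrete, here is a minimal sketch that assumes the parameter is the mean of a normal population and that $\hat{f}$ is the sample mean. The names `estimate_mean`, `theta`, and the choice of distribution are assumptions for illustration only, not part of the notes above.

```python
import random

def estimate_mean(samples):
    """Hypothetical estimator f-hat: the sample mean of the observations."""
    return sum(samples) / len(samples)

random.seed(0)
theta = 2.0  # true parameter; known here only because we simulate the population

# The estimate depends only on the observed samples, and changes with n.
for n in (10, 100, 10_000):
    samples = [random.gauss(theta, 1.0) for _ in range(n)]
    print(n, estimate_mean(samples))
```

With more samples the printed estimate should tend to sit closer to `theta`, though any single run can deviate.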

Assuming that the sample reflects the distribution of the whole population to some degree, an inference problem asks us to deduce one or more characteristics of the population from the sample. What are the methods of statistical inference?

Evaluating estimators:

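We can use the mean squared error to evaluate an estimator: $$\mathrm{MSE}(\hat{\theta}_{n}) := E[(\hat{\theta}_{n}-\theta)^2]$$
Note that $\mathrm{Var}(X) = E[(X - E[X])^{2}] = E[X^{2}] - (E[X])^{2}$, so $E[X^{2}] = \mathrm{Var}(X) + (E[X])^{2}$.
Expanding the square and applying this identity to both $\hat{\theta}_{n}$ and $\theta$, we get the following chain of equalities: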

$$
\begin{aligned}
\mathrm{MSE}(\hat{\theta}_{n}) &= E[\hat{\theta}_{n}^{2}] + E[\theta^{2}] - 2 E[\hat{\theta}_{n} \theta] \\
&= \mathrm{Var}(\hat{\theta}_{n}) + (E[\hat{\theta}_{n}])^{2} + \mathrm{Var}(\theta) + (E[\theta])^{2} - 2 E[\theta] E[\hat{\theta}_{n}] \\
&= \mathrm{Var}(\hat{\theta}_{n}) + \mathrm{Var}(\theta) + (E[\hat{\theta}_{n}] - E[\theta])^{2} \\
&= \mathrm{Var}(\hat{\theta}_{n}) + \mathrm{Var}(\theta) + |\mathrm{bias}(\hat{\theta}_{n})|^{2}
\end{aligned}
$$

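A caveat is that $\theta$ is not a random variable but a fixed value, so we can factor it out of expectations: $E[\theta] = \theta$ and $E[\hat{\theta}_{n} \theta] = \theta E[\hat{\theta}_{n}]$. It follows that $\mathrm{Var}(\theta) = 0$, which leaves $\mathrm{MSE}(\hat{\theta}_{n}) = \mathrm{Var}(\hat{\theta}_{n}) + |\mathrm{bias}(\hat{\theta}_{n})|^{2}$ with $\mathrm{bias}(\hat{\theta}_{n}) = E[\hat{\theta}_{n}] - \theta$. This equation is the bias-variance decomposition, and it is what the bias-variance tradeoff refers to.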
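The decomposition can be checked numerically. The sketch below assumes a normal population with mean `theta` and uses a deliberately biased, hypothetical estimator (`shrunk_mean`, the sample mean scaled by 0.9) so that the bias term is not zero. It repeats the experiment many times and compares the empirical MSE with the empirical variance plus squared bias. All names and constants here are illustrative assumptions.

```python
import random
import statistics

random.seed(0)
theta = 2.0    # true parameter: a fixed value, not a random variable
n = 20         # samples per experiment
trials = 5000  # number of repeated experiments

def shrunk_mean(samples, shrink=0.9):
    """Hypothetical biased estimator: the sample mean scaled toward zero."""
    return shrink * sum(samples) / len(samples)

# One estimate per simulated experiment.
estimates = [
    shrunk_mean([random.gauss(theta, 1.0) for _ in range(n)])
    for _ in range(trials)
]

# Empirical MSE, variance, and bias of the estimator across experiments.
mse = statistics.fmean([(e - theta) ** 2 for e in estimates])
variance = statistics.pvariance(estimates)
bias = statistics.fmean(estimates) - theta

print("MSE:            ", mse)
print("Var + bias^2:   ", variance + bias ** 2)
```

The two printed numbers should agree up to floating-point error, because the identity also holds for the empirical averages when the population variance (`pvariance`) is used. Shrinking lowers the variance at the cost of a nonzero bias, which is the tradeoff the decomposition makes visible.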

There is some other stuff about statistics, but I hate it.