6 - Probability - Continuous RV

![[Support/Figures/Pasted image 20250115225119.png]]
We can reuse the same ideas, making the syntactic substitution from sums to integrals.
![[Support/Figures/Pasted image 20250115231659.png]]
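Concretely, the substitution is $p_X \to f_X$ and $\sum \to \int$. As a reference (standard correspondences, stated here for completeness):

$$
E[X] = \sum_x x\, p_X(x) \;\longleftrightarrow\; E[X] = \int_{-\infty}^{\infty} x\, f_X(x)\, dx,
$$
$$
P(X \in A) = \sum_{x \in A} p_X(x) \;\longleftrightarrow\; P(X \in A) = \int_A f_X(x)\, dx.
$$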

Here, the $f_X$ are called probability density functions.
We have for example the uniform distribution, whose density is $f_X(x) = \frac{1}{b-a}$ for $a \le x \le b$, and $0$ otherwise.
In this case $E[X] = \int_a^b \frac{x}{b-a}\, dx = \frac{a+b}{2}$, which is what we expect: if everything is uniform, on average we land on the middle value. Also, $\sigma_X^2 = \frac{(b-a)^2}{12}$.
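As a quick check, both moments follow from direct integration (standard calculus, spelled out for completeness):

$$
E[X] = \int_a^b \frac{x}{b-a}\, dx = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2},
\qquad
\sigma_X^2 = \int_a^b \frac{\left(x - \frac{a+b}{2}\right)^2}{b-a}\, dx = \frac{(b-a)^2}{12}.
$$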
![[Support/Figures/Pasted image 20250115235614.png]]
![[Support/Figures/Pasted image 20250115235741.png]]
So the cumulative distribution function at $x$ is the probability that the random variable $X$ outputs something at most $x$. This is nice because it gives a unified way of reasoning about both discrete and continuous random variables.
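In symbols (the standard definition):

$$
F_X(x) = P(X \le x) =
\begin{cases}
\sum_{t \le x} p_X(t) & \text{discrete,} \\[4pt]
\int_{-\infty}^x f_X(t)\, dt & \text{continuous,}
\end{cases}
$$

and in the continuous case $f_X(x) = \frac{d}{dx} F_X(x)$ wherever the derivative exists.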
![[Support/Figures/Pasted image 20250116000457.png]]
![[Support/Figures/Pasted image 20250116000442.png]]
![[Support/Figures/Pasted image 20250116000805.png]]
![[Support/Figures/Pasted image 20250116000820.png]]
![[Support/Figures/Pasted image 20250116000919.png]]
![[Support/Figures/Pasted image 20250116001226.png]]
Let us dive deeper into Bayes' rule and conditioning.
Suppose we have a random variable $X$, and we know its distribution: $p_X(x)$ when discrete (pmf), or $f_X(x)$ when continuous (pdf). So the idea is that we know the DISTRIBUTION of $X$, which goes through some blackbox that is kind of noisy, and we get a random variable $Y$. The thing is, we can observe the values of $Y$ only. Our model for the blackbox is the conditional distribution $Y|X$, that is, the probability (mass or density) of observing any $y$ given some $x$.
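As a toy version of this setup (a minimal sketch; the uniform prior and additive Gaussian noise are assumptions made up for illustration, not something fixed by the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Prior: X ~ Uniform(0, 1). We know this distribution.
x = rng.uniform(0.0, 1.0, size=10_000)

# Noisy blackbox: Y | X = x ~ Normal(x, 0.1**2). This is our model of Y|X.
y = x + rng.normal(0.0, 0.1, size=x.shape)

# In the real problem we only get to see y; x stays hidden.
print(y[:5])
```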
Now, what is the inference problem here?
Observing some value $y$ of $Y$ collapses the real world into a state where we have observed this $y$. Given this observation, what was the likely state of $X$? That is, we want to model the distribution $X|Y$: having observed some $y$, we want to know the probability that any $x$ went into the blackbox.

Put another way, having observed $Y$, we want to make inferences about $X$.

In particular, if we observe some $x$ and then some $y$, it is the same as observing the pair $(x, y)$, which is the same as observing some $y$ and then some $x$.

Allowing $h$ to denote either probability mass or density (depending on discrete/continuous),
$$
h_X(x)\, h_{Y|X}(y|x) = h_{X,Y}(x,y) = h_Y(y)\, h_{X|Y}(x|y).
$$

Hence
$$
h_{X|Y}(x|y) = \frac{h_X(x)\, h_{Y|X}(y|x)}{h_Y(y)}
$$
(Bayes' rule).
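Note that $h_Y(y)$ does not depend on $x$, so it acts as a normalizing constant; one often just writes

$$
h_{X|Y}(x|y) \propto h_X(x)\, h_{Y|X}(y|x).
$$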
For example, let us say $X$ is a signal and we know its distribution $f_X(x)$ (often called the "prior"), and it goes through some noisy blackbox, which gives us an observable random variable $Y$; moreover, let us say we have a model of the noise, $f_{Y|X}(y|x)$.

From this information, we can calculate the marginal distribution $h_Y(y) = \int_x h_X(x)\, h_{Y|X}(y|x)\, dx$ (with the integral replaced by a sum in the discrete case).
And then, using Bayes' rule, we can infer the distribution $h_{X|Y}(x|y)$.
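Putting the whole pipeline together, here is a minimal numerical sketch (the uniform prior, the Gaussian noise with $\sigma = 0.1$, and the observed $y = 0.42$ are all made up for illustration; the grid is a crude stand-in for the integral):

```python
import numpy as np

# Grid over the possible values of X.
xs = np.linspace(0.0, 1.0, 1001)
dx = xs[1] - xs[0]

# Prior h_X: Uniform(0, 1), so constant density 1 on the grid.
prior = np.ones_like(xs)

# Noise model h_{Y|X}(y|x): Normal(x, sigma^2).
sigma = 0.1
def likelihood(y, x):
    return np.exp(-0.5 * ((y - x) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Suppose the blackbox outputs y = 0.42.
y_obs = 0.42

# Marginal h_Y(y) = integral of h_X(x) h_{Y|X}(y|x) dx, as a Riemann sum.
marginal = np.sum(prior * likelihood(y_obs, xs)) * dx

# Bayes' rule: posterior h_{X|Y}(x|y).
posterior = prior * likelihood(y_obs, xs) / marginal

print("posterior mean:", np.sum(xs * posterior) * dx)  # close to 0.42
```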
![[Support/Figures/Pasted image 20250116005119.png]]