You can read the LaTeX document online (for the latest updated chapters) from the link: probability.pdf
Chapter 2: Distribution functions
Probability is defined as a set function; we now review a related point function, called the distribution function.
Monotone functions enjoy many excellent properties, which are not reviewed here but are applied directly. Let \{a _ i\} be the countable set of jump points of F and b _ i the size of the jump at a _ i. Consider the function F _ d(x)=\sum _ i b _ i\, \mathbb I\, (x\geqslant a _ i), which represents the sum of all the jumps of F in the half-line (-\infty,x]. It is clearly increasing and right continuous, with F _ d(-\infty)=0 and F _ d(+\infty)=\sum _ i b _ i\leqslant1. Hence F _ d is a bounded increasing function. It should constitute the ``jumping part'' of F, and if it is subtracted from F, the remainder should be nonnegative, contain no more jumps, and so be continuous. These plausible statements are not really trivial, but they will not be proved rigorously here.
Suppose F _ c\not\equiv0 and F _ d\not\equiv0 in theorem 2. Then we may set \alpha=F _ d(+\infty), so that 0<\alpha<1, and F _ 1=\frac1\alpha F _ d, F _ 2=\frac1{1-\alpha}F _ c. Now F _ 1 is a discrete distribution function and F _ 2 is a continuous distribution function, and we have
\[
F=\alpha F _ 1+(1-\alpha)F _ 2.
\]
The representation still holds when \alpha=0 or \alpha=1: F is then itself continuous or discrete, respectively, and the other component may be chosen arbitrarily.
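The decomposition into a jump part and a continuous part can be checked on a concrete case. The sketch below uses a hypothetical mixed distribution function (an assumption made for illustration, not an example from the text): a jump of 1/4 at 0, a jump of 1/2 at 1, and the remaining mass 1/4 spread uniformly over (0,1). The names `JUMPS`, `F_d`, `F_c`, `F_1`, `F_2` simply mirror the symbols above.

```python
from fractions import Fraction

# Hypothetical mixed d.f. (assumed example): jump 1/4 at 0, jump 1/2 at 1,
# and mass 1/4 spread uniformly over (0, 1).
def F(x):
    if x < 0:
        return Fraction(0)
    if x < 1:
        return Fraction(1, 4) + Fraction(x) / 4
    return Fraction(1)

JUMPS = [(Fraction(0), Fraction(1, 4)), (Fraction(1), Fraction(1, 2))]  # (a_i, b_i)

def F_d(x):
    """Jump part: sum of the jumps b_i with a_i <= x."""
    return sum((b for a, b in JUMPS if x >= a), Fraction(0))

def F_c(x):
    """Continuous part: what remains after the jumps are subtracted."""
    return F(x) - F_d(x)

alpha = sum(b for _, b in JUMPS)  # alpha = F_d(+infinity)

def F_1(x):
    """Discrete d.f. obtained by normalizing the jump part."""
    return F_d(x) / alpha

def F_2(x):
    """Continuous d.f. obtained by normalizing the continuous part."""
    return F_c(x) / (1 - alpha)

# Check F = alpha * F_1 + (1 - alpha) * F_2 on a grid of rational points.
grid = [Fraction(k, 8) for k in range(-8, 17)]
assert all(F(x) == alpha * F_1(x) + (1 - alpha) * F_2(x) for x in grid)
```

For this particular F, `F_2` turns out to be the uniform distribution function on [0,1], and `F_1` puts mass 1/3 at 0 and 2/3 at 1. Exact rational arithmetic is used so that the identity can be tested with equality rather than a tolerance.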
Some results from real analysis are required here; they are reviewed (in Chinese) at https://gaomj.cn/realanalysis5/.
We put F _ {ac}(x)=\int _ {-\infty}^xF^\prime(t)\, \mathrm dt. By real analysis, F^\prime exists a.e. and is integrable, so this definition makes sense, and F _ {ac} is absolutely continuous. We also put F _ s=F-F _ {ac}. Then F^\prime _ {ac}=F^\prime\, \text{a.e.}, so that F^\prime _ s=0\, \text{a.e.}, and F _ s is singular if it is not identically zero.
Remark: Sometimes the density is defined only for absolutely continuous distribution functions. In that case the density always has a nice property: its integral recovers the corresponding distribution function. Consider the Cantor function, which is continuous but not absolutely continuous. It can be extended to a distribution function, but since its derivative is zero almost everywhere, the integral of the derivative is zero, and the nice property is lost. Therefore the sentence ``the random variable has a density'' usually means that the distribution function is absolutely continuous.
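The behaviour of the Cantor function can be seen numerically. The following is a minimal sketch, assuming the standard ternary-digit construction; the helper names `cantor` and `flat_length` and the truncation parameter `depth` are hypothetical choices for illustration. It shows that the function rises from 0 to 1 although it is constant on the removed middle thirds, whose total length tends to 1.

```python
from fractions import Fraction

def cantor(x, depth=40):
    """Approximate the Cantor function at x using `depth` ternary digits."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)      # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:      # x is in a removed middle third, where F is flat
            return value + scale
        value += scale * (digit // 2)  # ternary digit 2 becomes binary digit 1
        scale /= 2
    return value

def flat_length(n):
    """Total length of the middle thirds removed in the first n stages."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, n + 1))

# F is constant on [1/3, 2/3], so its derivative vanishes inside that interval.
assert cantor(Fraction(2, 5)) == cantor(Fraction(3, 5)) == Fraction(1, 2)
# Yet F still increases from 0 to 1 over [0, 1].
assert cantor(1) - cantor(0) == 1
```

Since `flat_length(n)` equals 1-(2/3)^n, the set where the derivative is zero has full length in the limit, so the integral of the derivative over [0,1] is 0 while F(1)-F(0)=1.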
It can be shown that F _ {ac} and F _ s are both increasing and no greater than F. We are now in a position to announce the following result, which is a refinement of theorem 5.
The proof is omitted here.
2. Relationship between p.m. and d.f.
There is in fact a one-to-one correspondence between the set functions on the one hand, and the point functions on the other. Both points of view are useful in probability theory.
\begin{equation}\label{1}
\mu((-\infty,x])=F(x),\quad\forall x\in\mathbb R.
\end{equation}
As a consequence, we have for -\infty<a<b<+\infty: \mu((a,b])=F(b)-F(a), \mu((a,b))=F(b-)-F(a), \mu([a,b))=F(b-)-F(a-), \mu([a,b])=F(b)-F(a-).
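The four interval formulas differ only in whether F or its left limit is used at each endpoint. As a sketch, take a hypothetical distribution function with atoms at both endpoints (an assumed example: jump 1/4 at 0, jump 1/2 at 1, mass 1/4 uniform on (0,1)). The left limit `F_left` is written out by hand for this particular F, and the function `mu` with its keyword flags is an illustrative helper, not notation from the text.

```python
from fractions import Fraction

def F(x):
    """Assumed example d.f.: jump 1/4 at 0, jump 1/2 at 1, uniform mass 1/4 on (0, 1)."""
    if x < 0:
        return Fraction(0)
    if x < 1:
        return Fraction(1, 4) + Fraction(x) / 4
    return Fraction(1)

def F_left(x):
    """Left limit F(x-), written out by hand for this particular F."""
    if x <= 0:
        return Fraction(0)
    if x <= 1:
        return Fraction(1, 4) + Fraction(x) / 4
    return Fraction(1)

def mu(a, b, closed_left=False, closed_right=True):
    """Measure of an interval with endpoints a < b; the default is (a, b]."""
    lower = F_left(a) if closed_left else F(a)    # a closed left end includes the atom at a
    upper = F(b) if closed_right else F_left(b)   # an open right end excludes the atom at b
    return upper - lower

assert mu(0, 1) == Fraction(3, 4)                                       # (0, 1]
assert mu(0, 1, closed_right=False) == Fraction(1, 4)                   # (0, 1)
assert mu(0, 1, closed_left=True, closed_right=False) == Fraction(1, 2) # [0, 1)
assert mu(0, 1, closed_left=True) == 1                                  # [0, 1]
```

The four values differ exactly by the atoms at 0 and 1, which is what the appearance of F(a-) and F(b-) in the formulas encodes.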
The idea of the proof is simple. Since (-\infty,x]\in\mathcal B, \mu((-\infty,x]) is defined; call it F(x), which defines the function F on \mathbb R. We shall show that F is a distribution function. First, F is increasing, by the monotonicity of \mu. Next, F is right continuous, by the continuity of \mu along decreasing sequences of sets. By a similar technique, F(-\infty)=0 and F(+\infty)=1. This ends the verification that F is a distribution function. The four relations in the corollary are also simple to prove.
The reader may refer to a text on measure theory for a complete proof. The following is the basic idea, as an important review. Given a distribution function F, we want a set function \mu. We first define \mu for intervals of the form (a,b] by \mu((a,b])=F(b)-F(a). Such a function is seen to be countably additive on its current domain of definition. We now extend its domain of definition while preserving this additivity. If S is a countable union of disjoint intervals of this form, S=\bigcup _ i(a _ i,b _ i], we are forced to define \mu(S), if at all, by \mu(S)=\sum _ i\mu((a _ i,b _ i])=\sum _ i[F(b _ i)-F(a _ i)]. But S may be represented in this form in more than one way, so we have to prove that \mu(S) is well defined, i.e., that it does not depend on the representation of S. Next, we notice that any open interval (a,b) is in the extended domain, and indeed \mu((a,b))=F(b-)-F(a), the same as before. It is well known that any open set U in \mathbb R is the union of a countable sequence of disjoint open intervals, say U=\bigcup _ i(c _ i,d _ i), and that this representation is unique. Hence again we are forced to define \mu(U), if at all, by \mu(U)=\sum _ i\mu((c _ i,d _ i))=\sum _ i[F(d _ i-)-F(c _ i)]. The values of \mu on all closed sets are thereby also determined by complementation, since \mu(C)=1-\mu(C^c). In particular, \mu(\{a\})=F(a)-F(a-). We now also know the value of \mu on all countable sets, and so on, all this provided that no contradiction is ever forced on us so far as we have gone. But we are still far from the \sigma-algebra \mathcal B. The next step would concern the value of \mu on G _ \delta and F _ \sigma sets; there are different ways to achieve our goal, but we omit the details here since it is not an easy task.
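The claim that \mu(S) does not depend on the representation of S can be illustrated, though of course not proved, on one example. The sketch below assumes a hypothetical F (jump 1/4 at 0, jump 1/2 at 1, mass 1/4 uniform on (0,1)) and compares three ways of writing S=(0,1]; the helper names are illustrative only.

```python
from fractions import Fraction

def F(x):
    """Assumed example d.f.: jump 1/4 at 0, jump 1/2 at 1, uniform mass 1/4 on (0, 1)."""
    if x < 0:
        return Fraction(0)
    if x < 1:
        return Fraction(1, 4) + Fraction(x) / 4
    return Fraction(1)

def mu_half_open(a, b):
    """mu((a, b]) = F(b) - F(a)."""
    return F(b) - F(a)

# Representation 1: S = (0, 1] as a single interval.
direct = mu_half_open(0, 1)

# Representation 2: S = (0, 1/2] U (1/2, 1]; the sum telescopes.
split = mu_half_open(0, Fraction(1, 2)) + mu_half_open(Fraction(1, 2), 1)

def dyadic_partial_sum(n):
    """Sum of mu over (2^-(k+1), 2^-k] for k = 0..n-1; their union is (2^-n, 1]."""
    return sum(
        mu_half_open(Fraction(1, 2 ** (k + 1)), Fraction(1, 2 ** k))
        for k in range(n)
    )

assert direct == split == Fraction(3, 4)
# Representation 3 is countably infinite: the gap to `direct` is F(2^-n) - F(0),
# which tends to 0 precisely because F is right continuous at 0.
assert direct - dyadic_partial_sum(10) == F(Fraction(1, 2 ** 10)) - F(0)
```

The third representation shows where right continuity enters: without it, the countable sum could fall short of F(1)-F(0), and \mu would fail to be countably additive even on half-open intervals.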
There is one more question: besides the \mu discussed above is there any other probability measure \nu that corresponds to the given F in the same way? It is important to realize that this question is not answered by the preceding theorem. Clearly we should phrase the question more precisely by considering only probability measures on \mathcal B. This will be answered in full generality by the next theorem.
The proof is omitted here.
We shall simply call \mu the probability measure of F, and F the distribution function of \mu.
Finally, as with the completion of a measure space, we define the completion of a probability space.
A set in \mathcal A with probability zero is called a null set. As in measure theory, any probability space (\Omega,\mathcal A,\mathbb P) can be completed according to the next theorem.
The proof may be found in standard resources on measure theory and is not given here. Let us discuss the advantage of completion. Suppose that a certain property, such as the existence of a certain limit, is known to hold outside a certain set N with \mathbb P(N)=0. Then the exact set on which the property fails to hold is a subset of N; it is not necessarily in \mathcal A, but it will be in \overline{\mathcal A}, with \overline{\mathbb P}-measure zero. We need the measurability of the exact exceptional set to facilitate certain dispositions, such as defining or redefining a function on it.