Distribution functions

You can read the LaTeX document online (for the latest updated chapters) from the link: probability.pdf

Chapter 2: Distribution functions

Probability is defined as a set function, and we now review a related point function called the distribution function.

1. Definition

Definition 1. A function F:\mathbb R\to\mathbb R that is increasing and right continuous with F(-\infty)=0, F(+\infty)=1 is called a distribution function.

Monotone functions enjoy many excellent properties, which are not reviewed here but applied directly. Let \{a _ i\} be the countable set of jumps of F and b _ i the size of the jump at a _ i. Consider the function F _ d=\sum _ i b _ i\, \mathbb I\, (x\geqslant a _ i), which represents the sum of all the jumps of F in the half-line (-\infty,x]. It is clearly increasing and right continuous, with F _ d(-\infty)=0 and F _ d(+\infty)=\sum _ i b _ i\leqslant1. Hence F _ d is a bounded increasing function. It should constitute the ``jumping part" of F, and if it is subtracted from F, the remainder should be positive, contain no more jumps, and so be continuous. These plausible statements will not be proved rigorously here; however, they are not really trivial.
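As a quick numerical illustration, F _ d can be evaluated directly from its definition; the jump locations a _ i and sizes b _ i below are invented for the example.

```python
# Sketch: evaluate the jump part F_d(x) = sum_i b_i * I(x >= a_i).
# The jump locations and sizes are hypothetical example data.
a = [-1.0, 0.0, 2.0]     # jump locations a_i
b = [0.25, 0.25, 0.25]   # jump sizes b_i, with sum <= 1

def F_d(x):
    """Sum of all jumps of F located in the half-line (-inf, x]."""
    return sum(bi for ai, bi in zip(a, b) if x >= ai)

print(F_d(-2.0))  # 0.0: no jumps at or below -2
print(F_d(0.0))   # 0.5: jumps at -1 and 0 both count (note x >= a_i, right continuity)
print(F_d(3.0))   # 0.75: all jumps counted
```

Note that the indicator uses x >= a _ i, not x > a _ i; this is exactly what makes F _ d right continuous.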

Theorem 2. Let F _ c=F-F _ d; then F _ c is positive, increasing, and continuous.

Theorem 3. If F _ d has the form mentioned above, then the decomposition F=F _ c+F _ d is unique.

Definition 4. A distribution function F of the form \sum _ i b _ i\, \mathbb I\, (x\geqslant a _ i), where additionally \sum _ i b _ i=1, is called a discrete distribution function. A distribution function that is continuous everywhere is called a continuous distribution function.

Suppose F _ c\neq0 and F _ d\neq0 in Theorem 2. Then we may set \alpha=F _ d(+\infty), so that 0<\alpha<1, and put F _ 1=\frac1\alpha F _ d, F _ 2=\frac1{1-\alpha}F _ c. Now F _ 1 is a discrete distribution function and F _ 2 is a continuous distribution function. We have
\[
F=\alpha F _ 1+(1-\alpha)F _ 2.
\]

Similar (degenerate) results hold even if \alpha=0 or \alpha=1.

Theorem 5. Every distribution function can be written as a convex combination of a discrete and a continuous one. Such a decomposition is unique.
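The construction above can be sketched numerically. The mixed distribution function below is a hypothetical example: a jump of size 0.5 at x=0 plus half of the uniform distribution on [0,1], so that \alpha=0.5.

```python
# Sketch: decompose a mixed d.f. F into alpha*F1 + (1 - alpha)*F2.
# F is hypothetical example data: jump of 0.5 at x = 0, plus half of
# the uniform(0, 1) distribution.

def F(x):
    cont = min(max(x, 0.0), 1.0)      # uniform(0, 1) c.d.f.
    jump = 1.0 if x >= 0 else 0.0     # unit step at 0
    return 0.5 * jump + 0.5 * cont

def F_dpart(x):                        # jump part: single jump b = 0.5 at a = 0
    return 0.5 if x >= 0 else 0.0

alpha = 0.5                            # = F_d(+inf)
F1 = lambda x: F_dpart(x) / alpha                   # discrete d.f.
F2 = lambda x: (F(x) - F_dpart(x)) / (1 - alpha)    # continuous d.f.

# Verify F = alpha*F1 + (1 - alpha)*F2 at a few sample points.
for x in (-1.0, 0.0, 0.5, 2.0):
    assert abs(F(x) - (alpha * F1(x) + (1 - alpha) * F2(x))) < 1e-12
```

Here F1 is a genuine discrete distribution function (a single jump of size 1 at 0) and F2 is the uniform(0,1) distribution function.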

Some results from real analysis are required here; they are reviewed (in Chinese) at https://gaomj.cn/realanalysis5/.

We put F _ {ac}(x)=\int _ {-\infty}^xF^\prime(t)\, \mathrm dt. By the results of real analysis this definition makes sense, and F _ {ac} is absolutely continuous. We also put F _ s=F-F _ {ac}; then F^\prime _ {ac}=F^\prime\, \text{a.e.}, so that F^\prime _ s=0\, \text{a.e.}, and F _ s is singular if it is not identically zero.

Definition 6. Any positive function f that is equal to F^\prime\, \text{a.e.} is called a density of F. F _ {ac} is called the absolutely continuous part, F _ s the singular part of F.

Remark: Sometimes the density is defined only for absolutely continuous distribution functions. In that case the density always has a nice property: its integral recovers the corresponding distribution function. Consider the Cantor function, which is continuous but not absolutely continuous. It can be extended to a distribution function, but since its derivative is zero almost everywhere, the integral of the derivative is zero, and the nice property fails. Therefore the sentence ``the random variable has a density" more often means that the distribution function is absolutely continuous.
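To make the remark concrete, here is a minimal sketch using a standard iterative approximation of the Cantor function: the function is locally constant off the Cantor set, so its derivative vanishes almost everywhere even though the function climbs from 0 to 1.

```python
def cantor(x, depth=40):
    """Approximate the Cantor function on [0, 1] by the usual recursion."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    y, scale = 0.0, 0.5
    for _ in range(depth):
        if x < 1/3:                 # left third: rescale and recurse
            x *= 3
        elif x <= 2/3:              # middle third: locally constant value
            return y + scale
        else:                       # right third: add the middle value, recurse
            y += scale
            x = 3*x - 2
        scale /= 2
    return y                        # x lies (numerically) in the Cantor set

# Constant on every removed middle-third interval; those intervals have
# total length 1, so F' = 0 a.e. and F_ac = int F' dt vanishes identically,
# while the function itself still rises from 0 to 1.
assert cantor(0.4) == cantor(0.6)   # constant across (1/3, 2/3)
assert cantor(0.5) == 0.5
assert cantor(1.0) == 1.0
```

So for this distribution function F _ {ac}=0 and F=F _ s: a continuous but purely singular distribution.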

It can be shown that F _ {ac} and F _ s are both increasing and bounded above by F. We are now in a position to state the following result, which is a refinement of Theorem 5.

Theorem 7. Every distribution function F can be written as the convex combination of a discrete, a singular continuous, and an absolutely continuous distribution function. Such a decomposition is unique.

The proof is omitted here.
2. Relationship between p.m. and d.f.

There is in fact a one-to-one correspondence between the set functions (probability measures) on the one hand and the point functions (distribution functions) on the other. Both points of view are useful in probability theory.

Theorem 8. Each probability measure on \mathcal B (family of Borel sets) determines a distribution function F through the correspondence
\begin{equation}\label{1}
\mu((-\infty,x])=F(x),\quad\forall x\in\mathbb R.
\end{equation}

As a consequence, we have for -\infty<a<b<+\infty: \mu((a,b])=F(b)-F(a), \mu((a,b))=F(b-)-F(a), \mu([a,b))=F(b-)-F(a-), \mu([a,b])=F(b)-F(a-).
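The four relations can be checked numerically on a hypothetical example: a distribution function with a single atom of size 0.25 at x=0 (the rest uniform on [0,1]), with left limits approximated by evaluating just below the endpoint.

```python
# Sketch: the four interval formulas for a d.f. with an atom of size 0.25
# at x = 0; F is hypothetical example data.

def F(x):
    return 0.25 * (1.0 if x >= 0 else 0.0) + 0.75 * min(max(x, 0.0), 1.0)

def F_left(x, eps=1e-9):
    """Left limit F(x-), approximated by evaluating just below x."""
    return F(x - eps)

mu = {
    '(a,b]': lambda a, b: F(b) - F(a),
    '(a,b)': lambda a, b: F_left(b) - F(a),
    '[a,b)': lambda a, b: F_left(b) - F_left(a),
    '[a,b]': lambda a, b: F(b) - F_left(a),
}

a, b = 0.0, 0.5
# The atom at 0 is counted exactly when the left endpoint is closed:
print(mu['(a,b]'](a, b))   # 0.375  (atom at 0 excluded)
print(mu['[a,b]'](a, b))   # 0.625  (atom at 0 included)
```

In particular \mu(\{0\})=F(0)-F(0-)=0.25, the size of the jump, consistent with the relations above.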

The idea of the proof is simple. Since (-\infty,x]\in\mathcal B, \mu((-\infty,x]) is defined; call it F(x), and so define the function F on \mathbb R. We shall show that F is a distribution function. First, F is increasing, by the monotonicity of \mu. Next, F is right continuous: if x _ n\downarrow x, then (-\infty,x _ n]\downarrow(-\infty,x], so that F(x _ n)\downarrow F(x) by the continuity from above of \mu. By similar techniques, F(-\infty)=0 and F(+\infty)=1. This ends the verification that F is a distribution function. The four relations in the corollary are also simple to prove.

Theorem 9. Each distribution function F determines a probability measure \mu on \mathcal B through \eqref{1}.

The reader may refer to measure theory for a complete proof. The following is the basic idea, as an important review. Given a distribution function F, we want a set function \mu. We first define \mu for intervals of the form (a,b] by means of \mu((a,b])=F(b)-F(a). Such a function is seen to be countably additive on its current domain of definition. Now we proceed to extend its domain of definition while preserving this additivity. If S is a countable union of such intervals which are disjoint, S=\bigcup _ i(a _ i,b _ i], we are forced to define \mu(S), if at all, by \mu(S)=\sum _ i\mu((a _ i,b _ i])=\sum _ i[F(b _ i)-F(a _ i)]. But S may be represented in this form in different ways, so we have to prove that \mu(S) is well defined, i.e., that it does not depend on the representation of S.

Next, we notice that any open interval (a,b) is in the extended domain, and indeed \mu((a,b))=F(b-)-F(a), the same as before. Now it is well known that any open set U in \mathbb R is the union of a countable sequence of disjoint open intervals, say U=\bigcup _ i(c _ i,d _ i), and this representation is unique. Hence again we are forced to define \mu(U), if at all, by \mu(U)=\sum _ i\mu((c _ i,d _ i))=\sum _ i[F(d _ i-)-F(c _ i)]. The values of \mu on all closed sets are thereby also determined, since a closed set is the complement of an open set. In particular, \mu(\{a\})=F(a)-F(a-). Now we also know the value of \mu on all countable sets, and so on -- all this provided that no contradiction is ever forced on us so far as we have gone. But we are still far from the \sigma-algebra \mathcal B. For the values of \mu on G _ \delta and F _ \sigma sets there are different ways to achieve our goal, but we omit the details here since it is not an easy task.
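The well-definedness claim can at least be sanity-checked numerically: by telescoping, \sum _ i[F(b _ i)-F(a _ i)] gives the same value for different disjoint-interval representations of the same set. The logistic distribution function below is just an illustrative choice.

```python
import math

def F(x):
    """Logistic d.f., used only as an illustrative distribution function."""
    return 1.0 / (1.0 + math.exp(-x))

def mu_of_union(intervals):
    """mu(S) = sum of F(b) - F(a) over a disjoint family of intervals (a, b]."""
    return sum(F(b) - F(a) for a, b in intervals)

# Three different representations of the same set S = (0, 2]:
rep1 = [(0.0, 2.0)]
rep2 = [(0.0, 1.0), (1.0, 2.0)]
rep3 = [(0.0, 0.5), (0.5, 1.7), (1.7, 2.0)]

vals = [mu_of_union(r) for r in (rep1, rep2, rep3)]
assert all(abs(v - vals[0]) < 1e-12 for v in vals)  # same value each time
```

Of course this only checks finite refinements of one set; the actual proof must handle arbitrary countable disjoint representations.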

There is one more question: besides the \mu discussed above is there any other probability measure \nu that corresponds to the given F in the same way? It is important to realize that this question is not answered by the preceding theorem. Clearly we should phrase the question more precisely by considering only probability measures on \mathcal B. This will be answered in full generality by the next theorem.

Theorem 10. Let \mu and \nu be two measures defined on the same \sigma-algebra \mathcal A, which is generated by the algebra \mathcal A _ 0. If either \mu or \nu is \sigma-finite on \mathcal A _ 0, and \mu(E)=\nu(E) for every E\in\mathcal A _ 0, then the same is true for every E\in\mathcal A, and thus \mu=\nu.

Corollary 11. Let \mu and \nu be \sigma-finite measures on \mathcal B that agree on all intervals of the form (a,b]. (Whether the endpoints are included does not matter, and (a,b] can be changed to, say, (-\infty,b), and so on.) Then they agree on \mathcal B.

The proof is omitted here.

Theorem 12. Given the probability measure \mu on \mathcal B, there is a unique distribution function F satisfying \mu((-\infty,x])=F(x)\, (x\in\mathbb R). Conversely, given the distribution function F, there is a unique probability measure \mu satisfying \mu((-\infty,x])=F(x)\, (x\in\mathbb R), or any of the relations like \mu((a,b])=F(b)-F(a).

We shall simply call \mu the probability measure of F, and F the distribution function of \mu.

Finally, similar to the completion of measure space, we define the completion of probability space.

Definition 13. The probability space (\Omega,\mathcal A,\mathbb P) is said to be complete iff any subset of a set N in \mathcal A with \mathbb P(N)=0 also belongs to \mathcal A.

A set in \mathcal A with probability zero is called a null set. As in measure theory, any probability space (\Omega,\mathcal A,\mathbb P) can be completed according to the next theorem.

Theorem 14. Given the probability space (\Omega,\mathcal A,\mathbb P), there exists a complete probability space (\Omega,\overline{\mathcal A},\overline{\mathbb P}) such that \mathcal A\subseteq\overline{\mathcal A} and \overline{\mathbb P} is an extension of \mathbb P.
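For orientation, the standard construction from measure theory (whose verification is omitted, like the other measure-theoretic details above) takes
\[
\overline{\mathcal A}=\{A\cup M:\ A\in\mathcal A,\ M\subseteq N\ \text{for some null set}\ N\in\mathcal A\},\qquad \overline{\mathbb P}(A\cup M)=\mathbb P(A).
\]
One then checks that \overline{\mathcal A} is a \sigma-algebra and that \overline{\mathbb P} is well defined and countably additive on it.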

The proof may be found in references on measure theory and is not given here. Let us discuss the advantage of completion. Suppose that a certain property, such as the existence of a certain limit, is known to hold outside a certain set N with \mathbb P(N)=0. Then the exact set on which the property fails to hold is a subset of N, not necessarily in \mathcal A, but it will be in \overline{\mathcal A} with \overline{\mathbb P}-measure zero. We need the measurability of the exact exceptional set to facilitate certain dispositions, such as defining or redefining a function on it.

