Contents
1. Real Numbers
1.1. Axiomatic Definition
1.1.1. Uniqueness
1.2. * Cantor's Construction
1.2.1. Definition of \mathbb R
1.2.2. Abstract Algebra
1.2.3. Completeness
1.3. Decimal representation
2. Complex Numbers
2.0.1. Axiomatic Definition
2.0.2. Set-theoretic Definition
1. Real Numbers The theory of real numbers is classical material for mathematical analysis. The main goal of this section is to define the set \mathbb R of real numbers. We will discuss some properties of the real numbers, but only those needed for the construction. More can be found in any mathematical analysis book.
Previously, we defined \mathbb Z on \mathbb N\times\mathbb N and \mathbb Q on \mathbb Z\times\mathbb Z. However, we cannot define \mathbb R in the same way, because there are many more numbers in \mathbb R. The cardinal number of \mathbb Q and of \mathbb Q\times\mathbb Q is \aleph _ 0, while \mathbb R has a larger one (if it has the usual properties after construction).
Intuitively, the simplest way to define a real number might be as a "nonterminating decimal expansion", like x=x _ 0.x _ 1x _ 2\cdots. In essence, this expansion is nothing but a sequence of integers x _ 0,x _ 1,\dots. An unnatural property of it is that some rational numbers have two decimal expansions; for example, 1.000\dots and 0.999\dots represent the same number 1. In addition, it is difficult to define the algebraic operations, let alone prove properties such as associativity and distributivity.
Dedekind's construction
One of the standard constructions of the real numbers introduces particular subsets of \mathbb Q, which are called Dedekind cuts, named after Dedekind. Informally, a cut is the set of all rational numbers which are strictly less than a given real number. A real number is then defined simply to be a cut: there is no difference in nature between a real number and the set of all the rational numbers which are strictly less than it.
This definition leads to natural definitions of the algebraic operations, and one can verify that these operations have the usual properties as expected. This done, one can forget this quite abstract construction and use only the arithmetic rules. The construction basically shows the existence of a set, contained in \mathcal P(\mathbb Q), which possesses some specific properties.
Another common construction uses the observation that a real number can be named by giving a sequence of rational numbers converging to it. The concepts of "convergent" and "equivalent" must be defined without reference to the real number to which the sequence is converging. This can be done using Cauchy sequences. This approach to constructing \mathbb R is due to Cantor.
We will discuss Cantor's approach in more detail later. For Dedekind's approach, the essential ideas can be found in The construction of |R. We will also use the idea of Dedekind cuts in some later proofs.
1.1. Axiomatic Definition As has been noted, we can consider the fundamental properties of the real numbers as axioms to be accepted.
- The set \mathbb R is a field.
- There is a linear order on \mathbb R, such that for x,y,z\in\mathbb R, x\leqslant y\Rightarrow (x+z\leqslant y+z) and (0\leqslant x)\wedge(0\leqslant y)\Rightarrow0\leqslant xy.
- Archimedean axiom: If x,y>0, there exists n\in\mathbb N such that y<nx.
- Completeness axiom: Every nonempty subset of \mathbb R which is bounded above has a least upper bound in \mathbb R.
We define \mathbb{N},\mathbb Z,\mathbb Q under this axiomatic setting:
- The numbers 0, 1, 1+1, 1+1+1,\dots, are called natural numbers. They form the set \mathbb N.
- The union of the set of natural numbers and the set of negatives of natural numbers is called the set of integers, denoted by \mathbb Z.
- Numbers of the form a\cdot b^{-1}, where a,b\in\mathbb Z, are called rational numbers. They form the set \mathbb Q.
By the first two axioms, (1) and (2), \mathbb R is an ordered field. The Archimedean axiom seems obvious, but it does not follow from (1) and (2), i.e., there exist ordered fields that are non-Archimedean. Such fields do not appear in usual mathematics, though.
The completeness axiom, which fails for \mathbb Q, characterizes the real numbers. It can be stated in several equivalent forms (although they are not introduced here). In fact, the Archimedean axiom can be proved from completeness, so it is common to exclude it from the list of axioms.
We return to prove the theorem. First, we can show that any nonempty subset A of \mathbb N (or \mathbb Z) that is bounded from above contains a maximum element. By the completeness axiom, A has a least upper bound s\in\mathbb R. Hence there is some m\in A such that s-1<m\leqslant s. We infer m=\max A, since m+1 is the least natural number larger than m and s<m+1. As a corollary, \mathbb N (or \mathbb Z) is not bounded above, otherwise there is a greatest natural number n, but n<n+1.
Similarly, any nonempty subset of integers that is bounded below contains a minimum element.
Since \mathbb N is not bounded above, the set \{n\in\mathbb N\mid y/x<n\} is nonempty for any given positive real numbers x,y. Then it is a subset of integers that is bounded below, and therefore has a least element n. We have n-1\leqslant y/x<n, implying (n-1)x\leqslant y<nx.□
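As a small numerical illustration of the statement just proved, the sketch below computes the integer n with (n-1)x\leqslant y<nx for positive rationals. It is only a sketch: the function name `archimedean_index` is hypothetical, and the inputs are restricted to `fractions.Fraction` values so that the arithmetic is exact.

```python
from fractions import Fraction
from math import floor


def archimedean_index(x: Fraction, y: Fraction) -> int:
    """Return the integer n with (n - 1) * x <= y < n * x, for x, y > 0."""
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be positive")
    # n is the least integer strictly greater than y / x,
    # i.e. the minimum of the set {n : y / x < n} used in the proof.
    return floor(y / x) + 1


x, y = Fraction(3, 7), Fraction(22, 5)
n = archimedean_index(x, y)
print(n, (n - 1) * x <= y < n * x)
```

The least element of \{n\mid y/x<n\} is obtained here with a floor, which is available for rationals; for general real numbers its existence is exactly what the proof above establishes.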
We will show such a set \mathbb R of real numbers exists soon. Now we discuss the uniqueness of \mathbb R.
Suppose \mathbb R and \mathbb R' are two models of the set of real numbers that satisfy the four axioms above. Let f:\mathbb R\to\mathbb R' be a function such that
\[f(x+y)=f(x)+f(y),\quad f(xy)=f(x)f(y).\]
Suppose f is not a zero function.
We investigate the image of f.
- Setting x=y=0, we obtain f(0)=2f(0) and then f(0)=0'.
- Setting x=y=1, we obtain f(1)=f(1)^2. If f(1)=0', f will be a zero function. Therefore f(1)\neq0' and f(1)=1' by cancellation law.
- Setting y=1, we obtain f(x+1)=f(x)+1' and by induction f(n)=n' for any n\in\mathbb{N}. For negative integers, we have f(-n)=-f(n)=-n' by setting x=-y=n. Hence f(m)=m' for all m\in\mathbb{Z}.
- Setting x=m and y=m^{-1}, we obtain f(m^{-1})=f(m)^{-1}=(m')^{-1}. Then f(m/n)=m'/n'.
This shows that the restriction of f to \mathbb Q is an order-preserving bijection onto \mathbb Q'. Irrational numbers seem harder to handle, but in fact we can still prove that f is bijective and preserves order on all of \mathbb R.
The technical details are as follows.
Injectivity is easy to prove. It suffices to show \ker f=\{0\}. Suppose f(x)=0' for some x\neq0. Then 1'=f(1)=f(x)f(x^{-1})=0', a contradiction. (In the language of algebra, the kernel of f is a proper ideal in the field \mathbb R, hence trivial.)
In order to prove surjectivity and order preserving, we prove an elementary property of positive real numbers.
Let y\geqslant0 be a real number; we show that y has a nonnegative square root. Let S=\{x\in\mathbb R\mid (x\geqslant0)\wedge (x^2\leqslant y)\}. This is a nonempty set since 0\in S, and is bounded above: if y>1, y is an upper bound, otherwise there exists x\in S such that 1<y<x and x^2\leqslant y<x<x^2, a contradiction; and if y\leqslant1, 1 is an upper bound, otherwise there exists x\in S such that x>1 and x^2\leqslant y\leqslant1<x, a contradiction.
By the completeness axiom, S has a least upper bound x. We can show that y=x^2, by contradiction. If x^2<y, we shall find a number greater than x but still in S, leading to a contradiction. We want (x+h)^2=x^2+2xh+h^2=x^2+h(2x+h)\leqslant y. It suffices to set 0<h<\min(1,\frac{y-x^2}{2x+1}), so that we have h(2x+h)<h(2x+1)<y-x^2 and then (x+h)^2<y. Thus x^2<y is impossible. Similarly x^2>y is also impossible. Hence x^2=y.
With this lemma, it is easy to prove the order preserving property of f. We want to show x\leqslant y\Leftrightarrow f(x)\leqslant f(y). It suffices to show 0\leqslant x\Leftrightarrow 0'\leqslant f(x). Suppose x\geqslant0 and y\geqslant0 is the square root of x, i.e., a real number such that y\geqslant0 and y^2=x. Then f(x)=f(y^2)=f(y)^2\geqslant0'. Suppose x<0, then f(-x)\geqslant0' and therefore f(x)\leqslant0'. The equality cannot hold, so if f(x)\geqslant0' then x\geqslant0. Thus, 0\leqslant x\Leftrightarrow 0'\leqslant f(x).
Next, let a<b be real numbers; we show that there is a rational number between them. By the Archimedean axiom, there exists n\in\mathbb N such that 1<n(b-a). Then 1/n<b-a. Also, there exists m\in\mathbb Z such that m-1\leqslant na< m. Then b>a+\frac1n\geqslant \frac mn. Hence a<\frac mn<b.
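The two choices in this argument (first n, then m) can be traced in a short sketch. The function name `rational_between` is hypothetical, and the endpoints are assumed to be rationals (`fractions.Fraction`) purely so that the computation is exact; for rational endpoints the midpoint would of course also work, and the point here is only to follow the proof.

```python
from fractions import Fraction
from math import floor


def rational_between(a: Fraction, b: Fraction) -> Fraction:
    """Return m/n with a < m/n < b, following the Archimedean argument."""
    if not a < b:
        raise ValueError("need a < b")
    n = floor(1 / (b - a)) + 1   # a natural number with 1 < n * (b - a)
    m = floor(n * a) + 1         # the integer with m - 1 <= n * a < m
    return Fraction(m, n)


a, b = Fraction(-7, 3), Fraction(-9, 4)
q = rational_between(a, b)
print(q, a < q < b)
```

The example uses negative endpoints on purpose: m must be allowed to be a negative integer.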
We now prove f is surjective. Suppose y'\in \mathbb R' and consider the cut S'=\{q'\in\mathbb Q'\mid q'<y'\}, nonempty and bounded above. Then the set S=\{q\in\mathbb Q\mid f| _ {\mathbb Q}(q)=q'\in S'\} is also nonempty and bounded above (for example, an upper bound can be the rational whose image under f is greater than y'). By completeness axiom, S has a least upper bound x. We assert f(x)=y'.
For q\in S, we infer q<x. If q\geqslant x, then since x=\sup S, q=x is the maximum element of S. Therefore q' is the maximum in S', which is impossible, because q'<y' indicates that there exists r'\in S' such that q'<r'<y' and r' is a larger element in S'. Thus q<x and q'=f(q)<f(x) for all q\in S, indicating f(x) is an upper bound of S'. So y'=\sup S'\leqslant f(x).
If y'<f(x), then there exists q'\in\mathbb Q' such that y'<q'<f(x). Since f preserves order, f(q)=q'<f(x)\Rightarrow q<x=\sup S. So there is some s\in S such that q<s<x and then y'<f(q)<f(s), a contradiction. Thus y'=f(x).
Existence of the function f
By the theorem above, in order to prove the uniqueness of the set of real numbers, we need to show the existence of a field homomorphism. We use the idea of Dedekind's cuts.
For rational numbers p/q\in\mathbb Q, we define f(p/q)=p'/q'\in\mathbb Q', which is shown to be an order preserving bijection, and then define
\[
F(x)=\sup\{f(q)\mid (q\in\mathbb Q) \wedge(q<x)\}.
\]
We shall show this is a homomorphism.
First, if x,y\in\mathbb R, then
\begin{gather*}
F(x+y)=\sup\{f(q)\mid(q\in\mathbb Q) \wedge(q<x+y)\},\\
F(x)=\sup\{f(q)\mid (q\in\mathbb Q) \wedge(q<x)\}, \\
F(y)=\sup\{f(q)\mid (q\in\mathbb Q) \wedge(q<y)\}.
\end{gather*}
For any rational q<x+y, it is not difficult to find r,s\in\mathbb Q such that r<x, s<y and q<r+s. Then f(q)<f(r)+f(s)\leqslant F(x)+F(y), so F(x)+F(y) is an upper bound and F(x+y)\leqslant F(x)+F(y). Conversely, if r,s\in\mathbb Q such that r<x and s<y, then r+s<x+y and f(r)+f(s)=f(r+s)\leqslant F(x+y). We obtain F(x)\leqslant F(x+y)-f(s) and then F(y)\leqslant F(x+y)-F(x). We conclude that F(x+y)=F(x)+F(y).
Next we shall show F(xy)=F(x)F(y). Consider the case where x,y\geqslant0. For any rational 0\leqslant q<xy, if q=0 then obviously f(q)=0'\leqslant F(x)F(y); if 0<q<xy, it is not difficult to find r,s\in\mathbb Q such that 0<r<x, 0<s<y and q<rs. Therefore f(q)<f(r)f(s)\leqslant F(x)F(y). This holds for all rational q<xy, so F(xy)\leqslant F(x)F(y). Conversely, if r,s\in\mathbb Q such that 0\leqslant r<x and 0\leqslant s<y, then rs<xy and f(r)f(s)=f(rs)\leqslant F(xy). As in the case of addition, we obtain F(x)F(y)\leqslant F(xy). We conclude that F(xy)=F(x)F(y) for all x,y\geqslant0. The negative case follows from F(-x)=-F(x), which holds because additivity has already been proved.
We summarize our results in the following theorem.
1.2. * Cantor's Construction In this section, we follow Cantor's approach to construct the set of real numbers, which shows that a model of the axiomatic definition exists. In this way, we embed the axiomatic definition into ZFC set theory.
Our current definition is: \mathbb N is the least inductive set, \mathbb Z=\mathbb N\times\mathbb N\mathbin/\sim and \mathbb Q=\mathbb Z\times\mathbb Z^\ast\mathbin/\sim', where \sim and \sim' are the corresponding equivalence relations. The goal is to construct a set \mathbb R.
We return to the informal idea that a real number is the limit of a convergent sequence of rational numbers. The problem is that the limit theory in analysis is based on the concept of real numbers. Therefore, we need to characterize the limit of a rational sequence without reference to it. Cauchy sequences are the suitable tools.
\mathbb Q with the usual metric
We begin with the property that \mathbb Q with the usual metric is a metric space. The usual metric defined by d(x,y)=|x-y| induces the usual topology on \mathbb R or \mathbb C. This was stated informally, but now we can prove \mathbb Q equipped with this function d is a metric space.
We already know that the < order on \mathbb Q is a strict linear order, so for r\in\mathbb Q exactly one of (r>0, r=0, r<0) holds. Alternatively, exactly one of (r>0, r=0, -r>0) holds. The absolute value of r is defined to be -r if -r>0 and r otherwise.
We shall verify: if x,y,z\in\mathbb Q, then d(x,x)=0, d(x,y)>0 for x\neq y, d(x,y)=d(y,x) and the triangle inequality d(x,z)\leqslant d(x,y)+d(y,z). All of these have been covered or derived directly from the previous discussion except for the triangle inequality. It is easy to prove: the absolute value has the property that |x|=\max\{x,-x\}, so we have d(x,y)+d(y,z)=|x-y|+|y-z|\geqslant x-y+y-z=x-z and |x-y|+|y-z|=|y-x|+|z-y|\geqslant y-x+z-y=z-x, indicating d(x,z)=|x-z|\leqslant d(x,y)+d(y,z). In particular, setting y=0 and replacing z by -z we obtain |x+ z|\leqslant |x|+|z|.
Cauchy sequences
Let (a _ n) be a sequence in a metric space X, i.e., a function from \mathbb N to X: (a _ n)\in X^{\mathbb N}. The sequence (a _ n) is called a Cauchy sequence if
\[
(\forall\varepsilon>0)(\exists N\in\mathbb N)(\forall m,n\geqslant N)\, d(a _ m,a _ n)<\varepsilon.
\]
For now, the term "for each \varepsilon>0" means "for each \varepsilon\in\mathbb Q _ + (the set of positive rationals)". After the set \mathbb R is constructed, it equivalently means "for each \varepsilon\in\mathbb R _ +". This term is used to de-emphasize the distinction.
Roughly speaking, the feature of a Cauchy sequence is that, provided two members are taken sufficiently far out in the sequence -- regardless of their relative positions -- the distance between them can be made arbitrarily small. In the limit theory of analysis, a sequence of real numbers converges if and only if it is a Cauchy sequence.
Let R be the set of all Cauchy sequences in \mathbb Q. This is not the set of real numbers that we desire, since two different sequences may have the same limit. Therefore we define an equivalence relation \sim: two sequences (a _ n) and (b _ n) are equivalent if and only if the limit of (a _ n-b _ n) is zero, i.e., (\forall \varepsilon>0)(\exists N\in\mathbb N)(\forall n\geqslant N)\, |a _ n-b _ n|<\varepsilon. It is convenient to introduce the terminology of null sequence: a sequence whose limit is zero. Then, (a _ n)\sim (b _ n) if and only if (a _ n-b _ n) is a null sequence.
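To make the equivalence relation concrete, the following sketch builds two different rational sequences that both approach the square root of 2 and checks that their termwise difference becomes small. The choice of sequences (a Newton iteration and a bisection) and the function names are assumptions made for illustration; a finite computation can only suggest, not prove, that the difference is a null sequence.

```python
from fractions import Fraction


def newton_sqrt2(n: int) -> Fraction:
    """n steps of a -> (a + 2/a) / 2 starting from 2 (all terms rational)."""
    a = Fraction(2)
    for _ in range(n):
        a = (a + 2 / a) / 2
    return a


def bisect_sqrt2(n: int) -> Fraction:
    """n bisection steps on [1, 2]; returns the lower endpoint."""
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(n):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo


# Termwise differences |a_n - b_n| of the two sequences.
diffs = [abs(newton_sqrt2(n) - bisect_sqrt2(n)) for n in range(1, 10)]
print([float(d) for d in diffs])
```

Neither sequence mentions a real number in its definition, which is the point of the construction: both are Cauchy sequences in \mathbb Q, and they are meant to represent the same class in R/\sim.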
We shall verify that \sim is an equivalence relation, and then the set of real numbers should be R/\sim. If this is done, the algebraic operations can be defined: for x=[(a _ n)] and y=[(b _ n)], x+y is defined to be [(a _ n+b _ n)] and xy is defined to be [(a _ nb _ n)]. Both definitions must be verified to be well-defined, i.e., [(a _ n+b _ n)] and [(a _ nb _ n)] are equivalence classes of Cauchy sequences and do not change if different representatives are chosen. In addition, an order relation needs to be defined.
Null sequences
Since the relation \sim can be characterized by the null sequences (the sequences which converge to zero), we investigate some properties of them. Let \mathfrak c _ 0 be the set of all null sequences in \mathbb Q.
\[\mathfrak{c} _ 0=\{(x _ n)\in\mathbb Q^{\mathbb N}\mid (\forall \varepsilon>0)(\exists N\in\mathbb N)(\forall n\geqslant N)\, |x _ n|<\varepsilon\}.\]
These are very classical results in analysis. In a metric space X, a subset Y\subseteq X is said to be bounded if there is M>0 such that d(x,y)\leqslant M for all x,y\in Y. A sequence (x _ n) is bounded if its image \{x _ n\mid n\in\mathbb N\} is bounded.
Suppose (x _ n)\in R. Then there is N\in\mathbb N such that d(x _ m,x _ n)<1 for all m,n\geqslant N. In particular, d(x _ n,x _ N)<1 for all n\geqslant N. Suppose M=\max\{d(x _ n,x _ N)\mid n\leqslant N\}. (The maximum of a finite set in \mathbb Q exists, which can be seen easily.) Then for all n\in\mathbb N we have d(x _ n,x _ N)\leqslant 1+M. For any m,n\in\mathbb N, d(x _ m,x _ n)\leqslant d(x _ m,x _ N)+d(x _ N,x _ n)\leqslant 2+2M. Hence (x _ n) is bounded.
Now suppose (x _ n)\in\mathfrak{c} _ 0 and \varepsilon>0. There is N\in\mathbb N such that |x _ n|<\varepsilon/2 for all n\geqslant N. Therefore |x _ m-x _ n|\leqslant |x _ m|+|x _ n|<\varepsilon for all m,n>N.
Suppose (a _ n),(b _ n)\in \mathfrak c _ 0 and \varepsilon>0. There are M,N\in\mathbb{N} such that |a _ n|<\varepsilon/2 for n\geqslant M and |b _ n|<\varepsilon/2 for n\geqslant N. Hence for n>\max\{M,N\} we have |a _ n+b _ n|\leqslant|a _ n|+|b _ n|< \varepsilon. This shows (a _ n+b _ n)\in\mathfrak{c} _ 0. Moreover, (b _ n) is bounded, so it is easy to see there is M'>0 such that |b _ n|\leqslant M' for all n. Then for \varepsilon/M'>0, there is N'\in\mathbb N such that |a _ n|<\varepsilon/M' for all n\geqslant N', so |a _ nb _ n|<\varepsilon for all n\geqslant N'. This shows (a _ nb _ n)\in\mathfrak{c} _ 0. Similarly, since (r _ n)\in R is bounded, (r _ n\cdot a _ n)\in\mathfrak{c} _ 0.□
With this proposition, \sim can easily be verified to be an equivalence relation. Recall that (a _ n)\sim(b _ n) if and only if (a _ n-b _ n)\in\mathfrak c _ 0. Hence (a _ n)\sim(a _ n) by (a _ n-a _ n)\in\mathfrak{c} _ 0, (a _ n)\sim(b _ n)\Rightarrow(b _ n)\sim(a _ n) by (b _ n-a _ n)=(-(a _ n-b _ n))\in\mathfrak{c} _ 0 and (a _ n)\sim(b _ n)\sim(c _ n)\Rightarrow(a _ n)\sim(c _ n) by (a _ n-c _ n)=((a _ n-b _ n)+(b _ n-c _ n))\in\mathfrak{c} _ 0.
1.2.1. Definition of \mathbb R We return to the set R of Cauchy sequences in \mathbb Q. It can be proved that R is closed under addition and multiplication.
Given \varepsilon>0, there is N _ 1\in\mathbb N such that for all m,n\geqslant N _ 1, |a _ n-a _ m|<\varepsilon/2 and |b _ n-b _ m|<\varepsilon/2. Then |a _ n+b _ n-(a _ m+b _ m)|\leqslant |a _ n-a _ m|+|b _ n-b _ m|< \varepsilon. This shows (a _ n+b _ n)\in R.
Since every Cauchy sequence is bounded, there is M>0 such that |a _ n|<M and |b _ n|<M for all n\in\mathbb N. For this M, there is N _ 2\in\mathbb N such that for all m, n\geqslant N _ 2, |a _ n-a _ m|<\varepsilon/(2M) and |b _ n-b _ m|<\varepsilon/(2M). Then |a _ nb _ n-a _ mb _ m|\leqslant |a _ n||b _ n-b _ m|+|a _ n-a _ m||b _ m|<\varepsilon. This shows (a _ nb _ n)\in R.□
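The key estimate for the product rests on the identity a _ nb _ n-a _ mb _ m=a _ n(b _ n-b _ m)+(a _ n-a _ m)b _ m. The sketch below checks the resulting inequality on two sample rational sequences; the sequences `a` and `b` are hypothetical examples chosen only for illustration.

```python
from fractions import Fraction


def a(n: int) -> Fraction:
    """Sample Cauchy sequence a_n = 1 + 1/(n+1) (an illustrative choice)."""
    return 1 + Fraction(1, n + 1)


def b(n: int) -> Fraction:
    """Sample Cauchy sequence b_n = 2 - 1/(n+1) (an illustrative choice)."""
    return 2 - Fraction(1, n + 1)


def product_gap(n: int, m: int):
    """Return (|a_n b_n - a_m b_m|, |a_n||b_n - b_m| + |a_n - a_m||b_m|)."""
    lhs = abs(a(n) * b(n) - a(m) * b(m))
    rhs = abs(a(n)) * abs(b(n) - b(m)) + abs(a(n) - a(m)) * abs(b(m))
    return lhs, rhs


# The left-hand side never exceeds the right-hand side.
print(all(l <= r for l, r in (product_gap(n, m) for n in range(30) for m in range(30))))
```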
In the following we define the addition and multiplication operations on \mathbb R, and then an order relation.
Suppose [(a _ n)],[(b _ n)]\in R/\sim. Define:
\[
[(a _ n)]+[(b _ n)]:=[(a _ n+b _ n)],\quad [(a _ n)]\cdot[(b _ n)]:=[(a _ nb _ n)].
\]
By the proposition above, these operations make sense. In order to prove they are well-defined, we need to show that if (a _ n)\sim(a _ n') and (b _ n)\sim(b _ n') then (a _ n+b _ n)\sim(a _ n'+b _ n') and (a _ n b _ n)\sim(a _ n' b _ n'). This can be seen easily with the properties of null sequences. We have (a _ n-a _ n')\in\mathfrak{c} _ 0 and (b _ n-b _ n')\in\mathfrak{c} _ 0, so (a _ n+b _ n-a _ n'-b _ n')\in\mathfrak{c} _ 0 and (a _ n-a _ n')(b _ n-b _ n')\in\mathfrak{c} _ 0. The addition is justified. For multiplication, let c _ n=a _ n-a _ n'\in\mathfrak{c} _ 0 and d _ n=b _ n-b _ n'\in\mathfrak{c} _ 0. Then
\[
a _ nb _ n-a _ n'b _ n'=(a _ n'+c _ n)(b _ n'+d _ n)-a _ n'b _ n'=a _ n'd _ n+c _ nb' _ n+c _ nd _ n.
\]
By Proposition 11, (a _ n'd _ n),(c _ nb' _ n),(c _ nd _ n) are all null sequences. Thus their sum is also a null sequence, i.e., (a _ nb _ n-a _ n'b _ n')\in\mathfrak{c} _ 0. The multiplication is also justified.
Order relation on \mathbb R
A sequence is a null sequence if it can be arbitrarily small for large n. On the contrary, if a Cauchy sequence (a _ n) is not a null sequence, how does it behave? It turns out that the sequence is "bounded away" from zero, positively or negatively: There is a positive \varepsilon>0 such that a _ n>\varepsilon for all n>N, or a _ n<-\varepsilon for all n>N, where N\in\mathbb N is a natural number depending on \varepsilon. Let P be the set of all Cauchy sequences "bounded away from zero positively":
\[P=\{(a _ n)\in R\mid (\exists\varepsilon>0) (\exists N\in\mathbb N)(\forall n>N)\, a _ n>\varepsilon\}.\]
We define an order on \mathbb R as follows:
For [(a _ n)],[(b _ n)]\in R/\sim, [(a _ n)]\leqslant[(b _ n)] if and only if (a _ n)\sim (b _ n) or (b _ n-a _ n)\in P. Informally the condition is that (b _ n-a _ n) is a null sequence (which implies [(a _ n)]=[(b _ n)]) or a Cauchy sequence bounded away from zero positively. One can verify that this definition is independent of the choice of representatives. We omit the verification here.
As usual < means \leqslant and \neq. Since [(x _ n)]=[(y _ n)]\in\mathbb R means (y _ n-x _ n)\in\mathfrak{c} _ 0, we have [(x _ n)]<[(y _ n)] if and only if (y _ n-x _ n)\in P.
For any x,y,z\in\mathbb R, clearly x\leqslant x. Suppose x\leqslant y\leqslant z. It can be seen that if the Cauchy sequence of y-x or z-y is bounded away from zero positively then z-x=(z-y)+(y-x) is bounded away from zero positively. The case x=y=z is trivial. Hence transitivity holds. Suppose x\leqslant y and y\leqslant x. Then x,y must coincide, otherwise the Cauchy sequences of y-x and x-y are both bounded away from zero positively, which is impossible. Thus \leqslant is a partial order.
To prove it is also a linear order, we prove the following proposition.
Given \varepsilon>0, there is N _ 1\in\mathbb N such that |x _ n-x _ m|<\varepsilon/2 for all m,n\geqslant N _ 1. There is a null subsequence (x _ {n _ k}) in (x _ n), so there is N _ 2\in\mathbb N such that |x _ {n _ k}|<\varepsilon/2 for all k\geqslant N _ 2. Now for n\geqslant N:=\max\{N _ 1,N _ 2\}, it holds that |x _ n|\leqslant |x _ n-x _ {n _ N}|+|x _ {n _ N}|<\varepsilon. This shows (x _ n)\in\mathfrak{c} _ 0.□
Suppose (a _ n),(b _ n)\in R and write c _ n:=b _ n-a _ n. If neither (c _ n) nor (-c _ n) is in P, then, using that (c _ n) is a Cauchy sequence, for any N\in\mathbb N^\ast there is n>N such that |c _ n|\leqslant1/N. A sequence (n _ k) in \mathbb{N} can be defined recursively: n _ 0=1, and n _ {k+1} is a natural number such that n _ {k+1}>n _ k and |c _ {n _ {k+1}}|\leqslant 1/n _ k. The sequence (n _ k) is strictly increasing.
For any \varepsilon>0, there is N\in\mathbb{N} such that 1/N<\varepsilon, so for k\geqslant N, 1/n _ k\leqslant 1/N<\varepsilon and then |c _ {n _ {k+1}}|\leqslant 1/n _ k<\varepsilon. This shows |c _ {n _ k}|<\varepsilon for all k>N. The subsequence (c _ {n _ k}) is null, so by the preceding proposition, (c _ n)=(b _ n-a _ n)\in\mathfrak{c} _ 0. We conclude that at least one of ((b _ n-a _ n)\in P, (a _ n-b _ n)\in P, (b _ n-a _ n)\in\mathfrak{c} _ 0) holds. This implies [(a _ n)]\leqslant[(b _ n)] or [(b _ n)]\leqslant[(a _ n)], i.e., \leqslant is a linear order.□
The order relation \leqslant can be verified to possess the following properties: if x,y,z\in\mathbb R, then x\leqslant y\Rightarrow(x+z)\leqslant(y+z) and (0\leqslant x)\wedge(0\leqslant y)\Rightarrow 0\leqslant xy. Here 0 denotes the equivalence class of the null sequences. The verification is not difficult. For example if [(0)]<[(x _ n)] and [(0)]<[(y _ n)], then (x _ n-0)=(x _ n)\in P and (y _ n)\in P. There is \varepsilon>0 and N\in\mathbb N such that x _ n,y _ n>\varepsilon for all n>N. Therefore x _ ny _ n>\varepsilon^2>0 and (x _ ny _ n)\in P. This shows (0<x)\wedge(0<y)\Rightarrow(0< xy). The other cases are proved similarly.
Embedding of \mathbb Q into \mathbb R
A constant sequence (r) in \mathbb Q is clearly a Cauchy sequence. Therefore the function
\[\iota:\mathbb Q\to\mathbb R,\quad r\mapsto \iota(r)=[(r)] \]
is an injection from \mathbb Q into \mathbb R. Since \iota preserves addition, multiplication, as well as order (as can be seen below), it is reasonable to denote [(r)] simply by r.
It can be seen easily that the order \leqslant on \mathbb R induces the usual order on \mathbb Q. If p,q\in\mathbb Q and [(p)]\leqslant [(q)], then (q-p)\in\mathfrak{c} _ 0 or (q-p)\in P. Since (q-p)\in\mathfrak c _ 0\Rightarrow q=p and (q-p)\in P\Rightarrow p<q, we have p\leqslant q.
Multiplicative inverses
We need one more property of \mathbb R: every x\neq0 in \mathbb R has a multiplicative inverse. Suppose [(x _ n)]\in\mathbb R and [(x _ n)]\neq0, which implies (x _ n)\in P or (-x _ n)\in P.
First suppose (x _ n)\in P. There is \varepsilon _ 0>0 and N _ 1\in\mathbb N such that x _ n>\varepsilon _ 0 for all n\geqslant N _ 1. Define (y _ n) to be x _ n^{-1} for n\geqslant N _ 1 and 0 for n<N _ 1. Since (x _ n) is a Cauchy sequence, for any \varepsilon>0 there is N _ 2\geqslant N _ 1 such that |x _ n-x _ m|<\varepsilon\cdot\varepsilon _ 0^2 for all m,n\geqslant N _ 2. Then by the property that absolute value is compatible with multiplication (which is easy to see),
\[
|y _ n-y _ m|=\frac{|x _ m-x _ n|}{|x _ nx _ m|}< \frac{\varepsilon\cdot\varepsilon _ 0^2}{\varepsilon _ 0^2}=\varepsilon,\quad \forall m,n\geqslant N _ 2.
\]
Therefore (y _ n)\in R. We have [(x _ n)][(y _ n)]=[(x _ ny _ n)]=1.
If (-x _ n)\in P, then [(-x _ n)] has an inverse [(z _ n)] by the first case. Then (-x _ nz _ n)\sim(1) and hence the inverse of [(x _ n)] is [(-z _ n)].
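The estimate used for the inverse sequence can be checked numerically. In the sketch below, the sequence `x` is a hypothetical example lying in P with \varepsilon _ 0=1 (all its terms exceed 1), and `y` is its termwise reciprocal; the check covers only finitely many indices and is meant as an illustration, not a proof.

```python
from fractions import Fraction

EPS0 = Fraction(1)  # lower bound for the sample sequence below (assumed example)


def x(n: int) -> Fraction:
    """Sample sequence in P: x_n = 1 + 1/(n+1) > EPS0 for every n."""
    return 1 + Fraction(1, n + 1)


def y(n: int) -> Fraction:
    """Termwise reciprocal y_n = 1 / x_n."""
    return 1 / x(n)


def inverse_gap(n: int, m: int):
    """Return (|y_n - y_m|, |x_m - x_n| / |x_n x_m|, |x_m - x_n| / EPS0**2)."""
    exact = abs(x(m) - x(n)) / abs(x(n) * x(m))
    return abs(y(n) - y(m)), exact, abs(x(m) - x(n)) / EPS0 ** 2


print(inverse_gap(3, 10))
```

The first two entries agree exactly, and the third dominates them, which is the inequality that transfers the Cauchy property from (x _ n) to (y _ n).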
1.2.2. Abstract Algebra We summarize our current results in the language of abstract algebra.
We already know that \mathbb Q is a commutative ring with identity 1\neq0, and so is \mathbb Q^\mathbb N if we define addition and multiplication componentwise as usual. The constant sequence (1) is the identity of \mathbb Q^\mathbb N.
There is an isomorphism from the ordered ring \mathbb Q to a subset of \mathbb R.
1.2.3. Completeness We shall prove \mathbb R is complete, in the sense that a nonempty subset bounded above has a least upper bound. It turns out that this property can be characterized by some equivalent statements, e.g., the following theorem.
In order to prove this theorem, we investigate the increasing sequences in \mathbb Q which are bounded above. We will show that they are Cauchy sequences.
Suppose (a _ n)\in\mathbb Q^{\mathbb N} is increasing and bounded above. Then for all n\in\mathbb N, a _ 0\leqslant a _ n\leqslant M for some M\in\mathbb Q. If a _ n=M for some n, then clearly M=\sup(a _ n). Therefore we consider the case a _ n<M for all n\in\mathbb N.
Set \Delta=M-a _ 0 and divide the set \{r\in\mathbb Q\mid a _ 0\leqslant r< M\} into N parts (N\in\mathbb{N}^\ast): every part consists of the rationals between a _ 0+\frac{k-1}{N}\Delta and a _ 0+\frac kN\Delta (k=1,\dots,N). We collect the indexes in each part:
\begin{gather*}
A _ k:=\{r\in\mathbb Q\mid a _ 0+\tfrac{k-1}{N}\Delta\leqslant r< a _ 0+\tfrac{k}{N}\Delta\},\\
I _ k:=\{n\in\mathbb N\mid a _ n\in A _ k\}.
\end{gather*}
The union of all I _ k is \mathbb N, so at least one I _ k is nonempty. As a finite nonempty set, \{k\in\{1,\dots,N\}\mid I _ k\neq\varnothing\} has a maximum element K. Since (a _ n) is increasing, there is N _ 1\in\mathbb N such that when n\geqslant N _ 1, a _ 0 + \frac {K-1} {N} \Delta \leqslant a _ n<a _ 0 + \frac K N\Delta. We obtain 0\leqslant a _ n - a _ m<\frac {K} {N}\Delta - \frac {K-1} {N}\Delta = \frac \Delta N for all n \geqslant m \geqslant N _ 1. Since N is arbitrary, it can be seen immediately that (a _ n) is a Cauchy sequence.
We have proved the following lemma.
Now suppose (x _ n)\in\mathbb{R}^{\mathbb N} is an increasing sequence in \mathbb R that is bounded above. If there is n\in\mathbb N such that x _ n=x _ {n+1}=\cdots, then x _ n is the supremum of the sequence. Otherwise, a subsequence can be extracted recursively so that it is strictly increasing. It can be seen that this subsequence shares the same upper bound as (x _ n). Hence we only need to consider the case where (x _ n) is strictly increasing.
Suppose for each n, x _ n=[(r _ {n,i}) _ {i\in\mathbb N}], where (r _ {n,i})\in R is a Cauchy sequence in \mathbb Q. Since x _ n<{x _ {n+1}}, (r _ {n+1,i}-r _ {n,i})\in P, so there is \delta _ n>0 and N _ n\in\mathbb N such that r _ {n+1,m}-r _ {n,m}>\delta _ n for all m\geqslant N _ n.
Since (r _ {n,i}) and (r _ {n+1,i}) are Cauchy sequences, there is i(n)\in\mathbb N such that |r _ {n,i+m}-r _ {n,i}|<\delta _ n/4 and |r _ {n+1,i+m}-r _ {n+1,i}|<\delta _ n/4 for all i\geqslant i(n) and m\in\mathbb N. We may assume i(n)\geqslant N _ n and i(n+1)\geqslant i(n).
Set s _ n=r _ {n,i(n)}+\delta _ n/2. Then for all m\in\mathbb N,
\begin{align*}
r _ {n+1,i(n)+m}-s _ n&=r _ {n+1,i(n)+m}-r _ {n,i(n)}-\delta _ n/2\\
&>r _ {n+1,i(n)}-\delta _ n/4-r _ {n,i(n)}-\delta _ n/2\\
&>\delta _ n-\delta _ n/4-\delta _ n/2=\delta _ n/4.\\
s _ n-r _ {n,i(n)+m}&=r _ {n,i(n)}+\delta _ n/2-r _ {n,i(n)+m}\\
&>r _ {n,i(n)}+\delta _ n/2-r _ {n,i(n)}-\delta _ n/4\\
&=\delta _ n/4.
\end{align*}
This shows (r _ {n+1,i}-s _ n) _ {i\in\mathbb N}\in P and (s _ n-r _ {n,i}) _ {i\in\mathbb N}\in P. Also,
\[
s _ {n+1}>r _ {n+1,i(n+1)}>s _ n+\delta _ n/4>s _ n.
\]
Thus we obtain an increasing sequence (s _ n)\in\mathbb Q^{\mathbb N} such that x _ n<s _ n<x _ {n+1} for all n\in\mathbb N. Suppose M\in\mathbb R is an upper bound of (x _ n), then s _ n<x _ {n+1}\leqslant M for all n, implying M is also an upper bound of (s _ n). By the lemma above, (s _ n) is a Cauchy sequence. Let s=[(s _ n)]\in\mathbb R. We have s _ n\leqslant s for all n\in\mathbb N. Hence x _ n<s _ n\leqslant s, i.e., s is an upper bound of (x _ n).
We further show that s=\sup(x _ n). Suppose, to the contrary, that there is x\in\mathbb{R} with x _ n\leqslant x<s for all n\in\mathbb N. Then s _ n<{x _ {n+1}}\leqslant x<{s} for all n. Since x<s, the Cauchy sequence representing s-x is bounded away from zero positively: if x=[(q _ i) _ {i\in\mathbb N}], then there is \varepsilon>0 and i _ 0\in\mathbb N such that s _ i-q _ i>\varepsilon for all i\geqslant i _ 0. But s _ n<x implies s _ n<q _ i for large i. The contradiction shows that s=\sup(x _ n).
The negative case is proved similarly.□
A classical result in analysis. Here is a standard proof.
Let A be a nonempty subset of \mathbb R with an upper bound M. We construct recursively a sequence (a _ n) and a sequence (b _ n) as follows: Let a _ 0 be an arbitrary number in A and b _ 0=M. For any n\in\mathbb N, if there is x\in A such that x\geqslant c _ n=(a _ n+b _ n)\mathbin/2, then let a _ {n+1}=c _ n and b _ {n+1}=b _ n. Otherwise, let b _ {n+1}=c _ n and a _ {n+1}=a _ n. By construction, (a _ n) is increasing and (b _ n) is decreasing, and 0\leqslant b _ n-a _ n=(b _ 0-a _ 0)\mathbin/2^n. Since (a _ n) has an upper bound b _ 0 and (b _ n) has a lower bound a _ 0, there exist \alpha=\sup(a _ n) and \beta=\inf(b _ n), by the preceding theorem. It is easily obtained that 0\leqslant \beta-\alpha\leqslant\inf\{(b _ 0-a _ 0)/2^n\} and that \inf\{(b _ 0-a _ 0)/2^n\}=0, so \alpha=\beta. Now for any x\in A, x\leqslant b _ n for all n, so x\leqslant\beta=\alpha, indicating \alpha is an upper bound of A. In addition, since (a _ n) is bounded above by b _ 0=M, \alpha\leqslant M. The same argument applies to any upper bound of A. Hence \alpha=\sup A.□
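The recursive construction in this proof is a bisection, and it can be sketched in a few lines once the question "is there x\in A with x\geqslant c?" can be answered. In the sketch below that question is supplied as a callable named `exceeds`; this callable, the function name `bisect_sup`, and the sample set A=\{k/(k+1)\mid k\in\mathbb N\} are all assumptions made for illustration. Rational arithmetic is used so that the interval lengths can be compared exactly.

```python
from fractions import Fraction


def bisect_sup(a0, M, exceeds, steps: int):
    """Run `steps` rounds of the construction and return (a_n, b_n).

    a0      -- an element of A
    M       -- an upper bound of A
    exceeds -- exceeds(c) is True iff some element of A is >= c
    """
    a, b = Fraction(a0), Fraction(M)
    for _ in range(steps):
        c = (a + b) / 2
        if exceeds(c):
            a = c   # some element of A lies at or above c
        else:
            b = c   # c is an upper bound of A
    return a, b


# Sample set A = {k/(k+1) : k in N}: some element is >= c exactly when c < 1.
a, b = bisect_sup(0, 3, lambda c: c < 1, 12)
print(a, b, b - a)
```

For this sample set the supremum is 1, and after each round it stays trapped between a _ n and b _ n while the gap halves.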
Up to now, we have verified that the set \mathbb R is a complete ordered field, i.e., the construction in this section implies the axiomatic definition. This construction formalizes the idea that a real number is characterized by a sequence "approaching it". This is intuitive and leads to natural definitions of the algebraic operations, while the proof of completeness is rather involved. Dedekind's construction implies completeness easily, but more effort and caution are needed when we define the algebraic operations.
Much more can be discussed on the completeness of \mathbb R. It is an important part in mathematical analysis, but we will not go into it here.
1.3. Decimal representation Now it can be seen easily that the (Cauchy) sequences (0.9,0.99,0.999,\dots) and (1,1,1,\dots) represent the same number, for the difference of the two is just a null sequence.
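The difference of the two sequences can be computed exactly: the n-th term of the first sequence falls short of 1 by exactly 10^{-n}. A minimal sketch, with the hypothetical helper name `nines`:

```python
from fractions import Fraction


def nines(n: int) -> Fraction:
    """The rational 0.99...9 with n nines after the decimal point."""
    return sum(Fraction(9, 10 ** k) for k in range(1, n + 1))


# Gaps 1 - 0.9, 1 - 0.99, ... between the two sequences.
gaps = [1 - nines(n) for n in range(1, 8)]
print(gaps)
```

Each gap equals 10^{-n}, so the sequence of gaps is a null sequence and the two Cauchy sequences are equivalent.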
Conversely, we can determine the decimal representation of a real number with a standard procedure, or obtain an approximation of it with any degree of precision. We just need to consider positive numbers, since for a negative number only a negative sign needs to be added.
The statement that every x\in\mathbb R _ + satisfies 10^{p-1}\leqslant x<10^p for a unique p\in\mathbb Z is similar to the Archimedean axiom, and we prove it by completeness. We assert that the set \{10^k\mid k\in\mathbb N\} has no upper bound. Otherwise it has a least upper bound s. Since s/10<s is not an upper bound, there is n\in\mathbb N such that s/10<10^n\leqslant s, which gives s<10^{n+1}, a contradiction. Hence, given any x\in\mathbb R _ +, there is N\in\mathbb N such that x<10^n for all n\geqslant N. It follows that given any x'\in\mathbb R _ + there is N'\in\mathbb N such that 10^{-n}<x' for all n>N'. Here 10^{-n} is the multiplicative inverse of 10^n.
This shows that the set \{n\in\mathbb Z\mid x<10^n\} is nonempty and bounded below, so it has a minimum p. For this p, clearly 10^{p-1}\leqslant x<10^p. The uniqueness can be seen easily.□
Decreasing the exponent by one, i.e. writing p for what was p-1 above, we have 1\cdot 10^{p}\leqslant x<10\cdot 10^{p}. By Theorem 2, there is a natural number x _ p\in\mathbb N such that x _ p\cdot 10^p\leqslant x<(x _ p+1)\cdot 10^p. We infer that x _ p\in\{1,2,\dots,9\}.
Then we can consider x-x _ p\cdot 10^p, which is at least 0 and less than 10^p. We rewrite this as 0\cdot 10^{p-1}\leqslant x-x _ p\cdot 10^p<10\cdot 10^{p-1}. Then similarly, there is a unique x _ {p-1}\in\{0,1,\dots,9\} such that
\[x _ p\cdot10^p+x _ {p-1}\cdot10^{p-1}\leqslant x<x _ p\cdot10^p+(x _ {p-1}+1)\cdot10^{p-1}.\]
Repeating the process, we obtain a sequence of rational numbers r _ n such that
\begin{gather*}
r _ n=x _ p\cdot10^p+\dots+x _ {p-n}\cdot 10^{p-n},\\
r _ n\leqslant x<r _ n+10^{p-n}=r _ n+\frac{1}{10^{n-p}}.
\end{gather*}
We can represent x by
\[x=x _ p\cdots x _ 1x _ 0.x _ {-1}x _ {-2}\cdots,\]
which is called the decimal representation of x. For example, 1=1.00\cdots and \pi=3.14\cdots.
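The digit-by-digit procedure can be sketched in code for rational inputs. This is only an illustration under stated assumptions: x is a positive rational given as a `Fraction`, p denotes the exponent of the leading digit (so 10^p\leqslant x<10^{p+1}, matching the formula for r _ n), and the names `leading_exponent` and `decimal_digits` are hypothetical.

```python
from fractions import Fraction

def leading_exponent(x):
    """Return the integer p with 10**p <= x < 10**(p+1), assuming x > 0."""
    p = 0
    while Fraction(10) ** (p + 1) <= x:
        p += 1
    while Fraction(10) ** p > x:
        p -= 1
    return p

def decimal_digits(x, count):
    """Return (p, [x_p, x_(p-1), ...]) with `count` digits of x > 0."""
    p = leading_exponent(x)
    digits = []
    remainder = Fraction(x)
    for k in range(count):
        weight = Fraction(10) ** (p - k)
        # Largest digit d with d * weight <= remainder; the invariant
        # 0 <= remainder < 10 * weight keeps d in {0, ..., 9}.
        d = remainder // weight
        digits.append(int(d))
        remainder -= d * weight
    return p, digits

print(decimal_digits(Fraction(22, 7), 5))  # (0, [3, 1, 4, 2, 8])
print(decimal_digits(Fraction(1, 8), 3))   # (-1, [1, 2, 5])
```

After n+1 digits, the partial sum of the digits times their weights plays the role of r _ n, and the leftover `remainder` is x-r _ n, which is at least 0 and less than 10^{p-n}.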
This algorithm gives different representations for different numbers, and different representations must come from different numbers. This is not difficult to verify. For example, for the second claim, suppose that there are two representations with the same p, obtained from numbers x,x' with sequences (r _ n),(r _ n') respectively, and that r _ n\neq r _ n' and r _ {k}=r _ k' for k<n. We may assume r _ n<r _ n'. Since both are integer multiples of 10^{p-n}, we have r' _ {n}-r _ n\geqslant 10^{p-n}, and therefore r' _ {n}\geqslant r _ n+10^{p-n}. Combining this with r _ n\leqslant x<r _ n+10^{p-n} and r' _ n\leqslant x', we get x<r' _ n\leqslant x', so the two sequences represent different numbers.
An impossible case
We can show the following representation is impossible for the algorithm:
\[\cdots x _ 0.x _ {-1}\cdots x _ {p-k}999\dots,\]
i.e., the digits are all 9 after some position.
For illustration, suppose we have a representation x=0.999\cdots. Then r _ n=9\cdot 10^{-1}+\dots+9\cdot 10^{-n} and r _ n\leqslant x<r _ n+10^{-n}=1, so 0<1-x\leqslant10^{-n} for all n\geqslant1. This contradicts our previous result that, for any x'\in\mathbb R _ +, we have 10^{-n}<x' for all sufficiently large n (take x'=1-x). The general case is similar and is omitted here.
One-to-one correspondence
Every real number has a decimal representation that has no endless 9s. Conversely, for every such representation there exists a real number with this representation.
For illustration, suppose we have a representation 3.14\cdots. Then r _ 0=3\leqslant r _ 1=3+1/10\leqslant r _ 2=3+1/10+4/100<3+1/10+5/100\leqslant3+2/10\leqslant4, so the sequence (r _ n) is increasing and bounded above. Take x=\sup(r _ n), which can be verified to equal the infimum of the upper sequence (r _ n+10^{-n}). It can be proved that r _ n\leqslant x<r _ n+10^{-n} holds for all n. Therefore we obtain a real number x together with its decimal representation. (Note that the representation is unique by the algorithm.)
2. Complex Numbers
Finally in this chapter, we discuss briefly the definition of complex numbers. The set of complex numbers is denoted by \mathbb C. We will extend the real field and obtain the following inclusion:
\[\mathbb N\subseteq\mathbb Z\subseteq\mathbb Q\subseteq\mathbb R\subseteq\mathbb C.\]
We made a great effort to define \mathbb R, but fortunately it is much easier for \mathbb C.
2.0.1. Axiomatic Definition For algebraic purposes, the extension from \mathbb N to \mathbb Z introduces additive inverses so that every equation of the form n+x=0 has a solution, and the extension from \mathbb Z to \mathbb Q introduces multiplicative inverses so that r\cdot x=1 has a solution for every nonzero r. The extension from \mathbb Q to \mathbb R is for analytical purposes, so as to obtain an order-complete field. The extension from \mathbb R to \mathbb C is motivated by algebraic purposes again, and it turns out that the result of the extension is still complete.
One may think the complex numbers are introduced to provide roots for the quadratic equations ax^2+bx+c=0 when b^2-4ac<0. In fact, historically people found formulas for the roots of equations of degree three, and found the amazing fact that they can provide real roots even if a square root of a negative number is involved. This led to the introduction of the number \sqrt{-1}, the square of which is -1. Euler introduced the notation \mathrm i to denote it. The complex numbers are then of the form a+b\mathrm i.
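A standard illustration, not taken from the text above and assuming the usual form of Cardano's formula for x^3=px+q, is the equation x^3=15x+4, for which the formula gives
\[x=\sqrt[3]{2+\sqrt{-121}}+\sqrt[3]{2-\sqrt{-121}}.\]
Computing formally with \mathrm i^2=-1, one finds (2\pm\mathrm i)^3=2\pm11\mathrm i, so the formula yields x=(2+\mathrm i)+(2-\mathrm i)=4, which is indeed a real root since 4^3=64=15\cdot4+4.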
Geometrically, one believes that the points of a straight line are in one-to-one correspondence with the real numbers, so one may be reluctant to accept the notion of complex numbers at first.
As in the case of the set \mathbb R, it is worth considering these numbers abstractly and forgetting the physical existence or interpretation of these numbers. We postulate:
- \mathbb C is a field.
- \mathbb R is a subfield of \mathbb C.
- There is a number \mathrm i\in\mathbb C such that \mathrm i^2=-1.
- Every z\in\mathbb C has the form z=x+y\mathrm i, where x,y\in\mathbb R.
If z=x+y\mathrm i, then x,y are unique. This is easy to verify.
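One way to carry out this verification, sketched here using only the properties listed above and the fact that squares in \mathbb R are nonnegative: suppose x+y\mathrm i=x'+y'\mathrm i with x,y,x',y'\in\mathbb R. If y\neq y', then
\[\mathrm i=\frac{x-x'}{y'-y}\in\mathbb R,\qquad\text{so}\qquad -1=\mathrm i^2=\Bigl(\frac{x-x'}{y'-y}\Bigr)^2\geqslant0,\]
a contradiction. Hence y=y', and then x=x' follows.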
Now we have: \mathbb C=\{x+y\mathrm i\mid x,y\in\mathbb R\}.
2.0.2. Set-theoretic Definition
One can always identify a complex number x+y\mathrm i with the ordered pair (x,y). This suggests another definition using just ordered pairs: let \mathbb C=\mathbb R\times\mathbb R, with addition and multiplication defined by
\begin{gather*}
(a,b)+(c,d)=(a+c,b+d),\\
(a,b)\cdot(c,d)=(ac-bd,ad+bc).
\end{gather*}
The mapping x\mapsto (x,0) from \mathbb R to \mathbb C is injective and preserves addition and multiplication. Hence it is reasonable to identify each x\in\mathbb R with (x,0).
Define \mathrm i=(0,1). Since (0,1)\cdot(0,1)=(-1,0), we have \mathrm i^2=-1, and
\[(x,y)=(x,0)+(0,y)=(x,0)+(y,0)(0,1)=x+y\mathrm i.\]
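The pair arithmetic is easy to experiment with. The following is a minimal sketch, assuming rational numbers (`Fraction`) or integers as stand-ins for real numbers; the names `add`, `mul`, `embed` and `I` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def add(z, w):
    """(a,b) + (c,d) = (a+c, b+d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def mul(z, w):
    """(a,b) * (c,d) = (ac-bd, ad+bc)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def embed(x):
    """The identification x -> (x, 0)."""
    return (x, 0)

I = (0, 1)  # the pair playing the role of i

print(mul(I, I))  # (-1, 0)

# (x, y) = (x, 0) + (y, 0) * (0, 1) for a sample pair
x, y = Fraction(3, 2), Fraction(-5)
print(add(embed(x), mul(embed(y), I)) == (x, y))  # True
```

The embedding respects both operations on such samples, for instance `mul(embed(2), embed(3))` equals `embed(6)`.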
One can check that the above definition implies the axiomatic definition. As a next step, one can define the absolute value and the conjugate of a complex number, which possess some useful properties. However, we will not do so here.
We have successfully defined the complex numbers using sets. Then, even a simple mathematical object, a real or complex number, is a set of sets of sets of ... of sets. In practice, one of course ignores its set structure and works in the axiomatic definition.
