Cantor Set
The Cantor set is set of points lying on a line segment. It is created by taking some interval, for instance $[0,1],$ and removing the middle third $\left(\frac{1}{3},\frac{2}{3}\right)$, then removing the middle third of each of the two remaining sections $\left(\frac{1}{9},\frac{2}{9}\right)$ and $\left(\frac{7}{9},\frac{8}{9}\right)$, then removing the middle third of the remaining four sections, and so on ad infinitum.
It is a closed set consisting entirely of boundary points, and is an important counterexample in set theory and general topology. When learning about cardinality, one is first shown subintervals of the real numbers, $\mathbb{R}$, as examples of uncountably infinite sets. This might suggest that any uncountably infinite subset of $\mathbb{R}$ must contain an interval; such an assertion is false. A counterexample to this claim is the Cantor set $\mathcal{C} \subset [0,1]$, which is uncountable despite not containing any intervals. In addition, Cantor sets are uncountable, may have 0 or positive Lebesgue measures, and are nowhere dense. Cantor sets are the only disconnected, perfect, compact metric space up to a homeomorphism.
Contents
Construction
The Cantor set is constructed by removing increasingly small subintervals from $[0,1]$. In the first step, remove $\left(\frac13, \frac23\right)$ from $[0,1]$. In the second step, remove $\left(\frac19, \frac29\right) \cup \left(\frac79, \frac89\right)$ from what remains after the first step. In general, on the $n^\text{th}$ step, remove $\left(\frac{1}{3^n}, \frac{2}{3^n}\right) \cup \left(\frac{4}{3^n}, \frac{5}{3^n}\right) \cup \cdots \cup \left(\frac{3^n - 2}{3^n}, \frac{3^n - 1}{3^n}\right)$ from what remains after the $(n-1)^\text{th}$ step. After all $\mathbb{N}$ steps have been taken, what remains is the Cantor set $\mathcal{C}$.
This construction can be formalized as follows. Let $\mathcal{C}_0 = [0,1]$. Then, after the first step, what remains is $\mathcal{C}_1 = \left[0,\frac13\right] \cup \left[\frac23,1\right]$. After the second step, what remains is $\mathcal{C}_2 = \left[0,\frac{1}{9}\right] \cup \left[\frac{2}{9}, \frac{1}{3}\right] \cup \left[\frac{2}{3},\frac{7}{9}\right] \cup \left[\frac{8}{9}, 1\right].$ Continuing in this manner, one obtains an infinite collection of sets $\{\mathcal{C}_i\}_{i=0}^{\infty}$ such that $\mathcal{C}_i \subset \mathcal{C}_{i-1}$ for all $i\ge 1$. Then, one defines the Cantor set to be $\mathcal{C} = \bigcap_{i=0}^{\infty} \mathcal{C}_i.$ Intuitively, this makes sense; since $\bigcap_{i=0}^{n} \mathcal{C}_i = \mathcal{C}_n$ for all $n\ge 1$, taking an infinite intersection would provide the "limiting set" in this situation.
Properties
There is an alternate characterization that will be useful to prove some properties of the Cantor set:
$\mathcal{C}$ consists precisely of the real numbers in $[0,1]$ whose base-3 expansions only contain the digits 0 and 2.
Base-3 expansions, also called ternary expansions, represent decimal numbers on using the digits $0,1,2$. For instance, the number 3 in decimal is 10 in base-3. Fractions in base-3 are represented in terms of negative powers of 3, where a fraction $N$ in decimal can be represented as $N = c_1b^{-1}+c_2b^{-2}+c_3b^{-3}+\cdots+c_n{b^{-n}}+\cdots,$ where $0≤ c_i <3$, and the fraction is written in base-3 as $N = 0.c_1c_2c_3...$.
For instance, $\frac14 = 0.25 = 0\times 3^{-1} + 2\times 3^{-2} + 0\times 3^{-3} + 2\times 3^{-4}+\cdots = 0.\overline{02}_3$.Similarly, decimal fractions can be converted by multiplying by $3$, taking the result mod 3, then multiplying the remainder by $3$, and then taking mod 3 of that... etc. until the remainder is 0: $\begin{aligned} 0.25 \times 3 = \mathbf{0} + 0.75,\ \ c_1 &= 0\\ 0.75 \times 3 = \mathbf{2} + 0.25,\ \ c_2 &= 2\\ 0.25 \times 3 = \mathbf{0}+ 0.75,\ \ c_3 &= 0\\ 0.75 \times 3 = \mathbf{2} + 0.25,\ \ c_4 &= 2\\ &\vdots\\ &\Rightarrow 0.0202..._3. \end{aligned}$
In base-3, some points have more than one notation:
$\frac13 \times 3 = \mathbf{1} + 0,\ c_1 = 1$. So $\frac13 = 0.1_3.$ But we also have the following: $\begin{aligned} \frac13 \times 3= 0.3333... \times 3 = \mathbf{0} + 0.9999...,\ c_1 &= 0\\ 0.9999... \times 3 = \mathbf{2} + 0.9999...,\ c_2 &= 2\\ 0.9999... \times 3 = \mathbf{2} + 0.9999...,\ c_3 &= 2\\ &\vdots \end{aligned}$ So, $\frac13 = 0.\overline{3} = 0.1_3 = 0.0\overline{2}_3.$ This sometimes means there is ambiguity in $\mathcal{C}$, for if $\mathcal{C}$ expansions only contain the digits 0 and 2, then $\mathcal{C}$ contains $0.0\overline{2}_3$ but not $0.1_3$ even though they are the same. In this case, it's said to contain $0.0\overline{2}_3 = \frac13,$ which implies it contains $0.1_3$ because of the multiple notations.
Suppose $x\in [0,1]$ contains only the digits 0 and 2 in its base-3 expansion. Let $x_n$ be the truncation of $x$ at $n$ places after the decimal point. For example, if $x = 0.020202\cdots_{3}$, then $x_1 = 0$, $x_2 = 0.02_{3}$, $x_4 = 0.0202_{3}$, etc. Certainly, the sequence $x_n$ converges to $x$ as $n\to\infty$. In particular, for every $n\ge 1$, we have $x_n \le x \le x_n + \frac{1}{3^n}.$ Note that the numbers in $[0,1]$
- whose base-3 expansions go on for exactly $n$ digits after the decimal point and
- which use only the digits 0 and 2
are precisely the left endpoints of the intervals whose union is $\mathcal{C}_n$. Thus, the interval $\left[x_n, x_n + \frac1{3^n}\right]$ is contained in $\mathcal{C}_n$. It follows that $x\in \mathcal{C}_n$ for all $n\ge 0$, and hence $x\in \mathcal{C}$. $_\square$
Conversely, suppose $x\in \mathcal{C}$. Then $x\in \mathcal{C}_n$ for all $n\ge 0$. Note that the numbers in $\mathcal{C}_n$ are precisely those whose $n^\text{th}$ truncation (i.e., the number obtained by taking only the first $n$ digits after the decimal point) uses only 0 and 2 as digits in base 3. It follows that every truncation of $x$ uses only 0 and 2 as digits. This implies $x$ uses only 0 and 2 in its base-3 expansion. $_\square$
From this theorem, the proofs of the following two properties follow:
$\mathcal{C}$ does not contain any subintervals of $[0,1]$.
Let $[a,b] \subset [0,1]$ be an arbitrary interval. Write $a = 0.a_1 a_2 a_3 \cdots_{3}$ and $b = 0. b_1 b_2 b_3 \cdots_{3}.$ If some $a_i$ equals 1, then by the previous theorem, we know $a\not \in \mathcal{C}$. Similarly, if some $b_i$ equals 1, we know $b\not\in \mathcal{C}$. Otherwise, suppose $k$ is the smallest index such that $a_k \neq b_k$. Our casework forces $a_k = 0$ and $b_k = 2$, so $0.a_1 a_2 a_3 \cdots a_{k-1} 1_{3} \in [a,b]$, and it follows that $[a,b] \not\subset \mathcal{C}$. $_\square$
$\mathcal{C}$ is uncountable.
Define a function $f: \mathcal{C} \to [0,1]$ as follows. If $x = 0. x_1 x_2 x_3 \cdots_{3}$ is an element in $\mathcal{C}$, set $f(x) = 0 . (x_1 / 2) (x_2 / 2) (x_3 / 2) \cdots _{2}.$ In other words, take the ternary expansion of $x$, replace every digit $2$ with $1$, and consider the result as a binary number.
This function is surjective, since any element $y = 0.y_1 y_2 y_3 \cdots_{2}$ of $[0,1]$ has $f\big(0. (2y_1) (2y_2) (2y_3) \cdots_{3}\big) = y$.
Now, suppose $\mathcal{C}$ is countable. Write $\mathcal{C} = \{c_1, c_2, c_3, \cdots\}$. For each $y\in [0,1]$, let $g(y)$ equal the index $j$ of an element $c_j$ with $f(c_j) = y$ (remark: in defining this function $g$, we are implicitly using the axiom of choice). Then, we may construct a sequence $\{d_i\}_{i\in \mathbb{N}}$ such that $d_{g(y)} = y$ for all $y \in [0,1]$. This constitutes a bijection between $[0,1]$ and $\mathbb{N}$, so it follows that $[0,1]$ is countable. Contradiction!
Hence, we conclude $\mathcal{C}$ is uncountable. $_\square$
Let $\mathcal{C} + \mathcal{C}$ denote the sumset of the Cantor set $\mathcal{C}$. That is, $\mathcal{C} + \mathcal{C} = \{x+y \, : \, x, y \in \mathcal{C}\}.$ Which of the following statements is true about $\mathcal{C} + \mathcal{C}$?
References
- 127 "rect", . Cantor ternary set, in seven iterations. Retrieved September 5th, 2016, from https://commons.wikimedia.org/wiki/File:Cantor_set_in_seven_iterations.svg