Metric Space
A metric space is a set equipped with a distance function, which provides a measure of distance between any two points in the set. The distance function, known as a metric, must satisfy a collection of axioms. One represents a metric space \(S\) with metric \(d\) as the pair \((S, d)\).
For example, \(\mathbb{R}^2\) is a metric space, equipped with the Euclidean distance function \(d_{E}: \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}\) given by \[d_{E} \big((x_1, y_1), (x_2, y_2)\big) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}. \] However, there are other metrics one can place on \(\mathbb{R}^2\); for instance, the taxicab distance function \[d_{T} \big((x_1, y_1), (x_2, y_2)\big) = |x_1 - x_2| + |y_1 - y_2|. \] The pairs \(\big(\mathbb{R}^2, d_{E}\big)\) and \(\big(\mathbb{R}^2, d_{T}\big)\) are both metric spaces.
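For readers who like to experiment, here is a minimal Python sketch (not part of the original exposition; the sample points are arbitrary) comparing the two distance functions on the same pair of points:

```python
import math

def d_euclidean(p, q):
    # Euclidean distance in R^2: sqrt((x1 - x2)^2 + (y1 - y2)^2)
    return math.hypot(p[0] - q[0], p[1] - q[1])

def d_taxicab(p, q):
    # Taxicab distance in R^2: |x1 - x2| + |y1 - y2|
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

p, q = (0.0, 0.0), (3.0, 4.0)
print(d_euclidean(p, q))  # 5.0
print(d_taxicab(p, q))    # 7.0
```

The two metrics generally assign different distances to the same pair of points, yet each satisfies the axioms given below.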
Metric spaces are extremely important objects in real analysis and general topology.
Motivation
In calculus, there is a notion of convergence of sequences: a sequence \(\{x_n\}\) converges to \(x\) if \(x_n\) gets very close to \(x\) as \(n\) approaches infinity. In using the phrase "gets very close," one implicitly refers to a notion of distance, where two things are close if the distance between them is small. In the case of real numbers, the distance between \(x, y \in \mathbb{R}\) is given by the absolute value \(|x-y|\).
However, \(\mathbb{R}\) is not the only set for which there is a notion of distance between elements. For instance, the higher dimensional Euclidean spaces \(\mathbb{R}^n\) and the circle all have their own notions of distance. In \(\mathbb{R}^n\), the Euclidean distance between two points \(\mathbf{x} = (x_1, \cdots, x_n)\) and \(\mathbf{y} = (y_1, \cdots, y_n)\) is defined to be \[\| \mathbf{x} - \mathbf{y} \| = \sqrt { \sum_{i=1}^{n} (x_i - y_i)^2 }. \] On a circle, one notion of distance is arcwise distance, where the distance between two points is the length of the shorter arc they bound.
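As a hedged illustration (my own, not from the original text), the following Python snippet computes the Euclidean distance in \(\mathbb{R}^n\) and the arcwise distance on a circle, where points on the circle are represented by their angles in radians:

```python
import math

def euclidean(x, y):
    # Euclidean distance in R^n: sqrt(sum_i (x_i - y_i)^2)
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def arc_distance(theta1, theta2, radius=1.0):
    # Arcwise distance on a circle: length of the shorter arc
    # bounded by the two points, given by their angles.
    delta = abs(theta1 - theta2) % (2 * math.pi)
    return radius * min(delta, 2 * math.pi - delta)

print(euclidean((1, 2, 2), (0, 0, 0)))     # 3.0
print(arc_distance(0.0, 3 * math.pi / 2))  # pi/2, the shorter arc
```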
To study convergence of sequences in the multitude of distance-equipped objects that appear throughout mathematics, there are two possible approaches. One could define convergence separately for each object and work through many similar proofs over and over again. Or, one could define an abstract notion of "space with distance," work through the proofs once, and show that many objects are instances of this abstract notion. The second approach is much easier and more organized, so the concept of a metric space was born.
Definition
A metric space is a pair \((M, d),\) where \(M\) is a set and \(d\) is a function \(M \times M \to \mathbb{R}\) satisfying the following axioms:
For any \(x, y, z\in M\),
- \(d(x,y) = d(y,x)\) (symmetry axiom: the distance between two points shouldn't depend on the order in which the points are input to the distance function);
- \(d(x, y) = 0\) if and only if \(x= y\) (the only point that is zero distance from a given point is the given point itself);
- \(d(x,z) \le d(x,y) + d(y,z)\) (an axiomatization of the triangle inequality).
If \(d\) satisfies these axioms, it is called a metric.
These axioms are intended to distill the most common properties one would expect from a metric. Already, one can see that these axioms imply results that are consistent with intuition about distances. For example, the axioms imply that the distance between two points is never negative.
If \((M, d)\) is a metric space and \(x,y \in M\), then \(d(x,y) \ge 0\).
By the triangle inequality, \(d(x,x) \le d(x,y) + d(y,x)\). We know \(d(x,x) = 0\) and \(d(x,y) = d(y,x)\), so this inequality implies \(2d(x,y) \ge 0\). Dividing this by two gives the desired result. \(_\square\)
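The axioms can also be spot-checked numerically on a finite list of points. The following is a minimal sketch (the function names and sample points are my own, not from the text); a passing check is of course not a proof, only a sanity test.

```python
import itertools

def check_metric_axioms(d, points, tol=1e-12):
    """Spot-check the metric axioms for d on a finite list of points."""
    for x, y in itertools.product(points, repeat=2):
        if abs(d(x, y) - d(y, x)) > tol:        # symmetry
            return False
        if (x == y) != (d(x, y) <= tol):        # d(x, y) = 0 iff x = y
            return False
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:   # triangle inequality
            return False
    return True

euclid = lambda p, q: ((p[0] - q[0])**2 + (p[1] - q[1])**2) ** 0.5
print(check_metric_axioms(euclid, [(0, 0), (1, 0), (0, 1), (2, 3)]))  # True
```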
How many of the following pairs \((M, d)\) are metric spaces?
\(M = \mathbb{R}^n\) and \[d\big((x_1, \ldots, x_n), (y_1, \ldots, y_n)\big) = \max_{1\le i \le n} |x_i - y_i|\]
\(M = \{a, b, c, d\},\) where \(d(a,b) = d(a,c) = 3\), \(d(a,d) = d(b,c) = 7\), and \(d(b,d) = d(c,d) = 11\)
\(M = \mathcal{C}[0,1]\), the set of continuous functions \([0,1] \to \mathbb{R}\), and \[d(f,g) = \max_{x\in [0,1]} |f(x) - g(x)|\]
\(M = \mathcal{C}[0,1]\) and \[d(f,g) = \int_{0}^{1} \big(f(x) - g(x)\big)^2 \, dx\]
Closed Sets and Continuity
In general topology, there are two common types of sets: open sets and closed sets. Intuitively, an open set is a set that does not contain its boundary. For instance, the open interval \((0,1)\) contains an infinite number of points approaching \(0\), like \(\frac{1}{2},\frac{1}{4},\frac{1}{8},\frac{1}{100},\frac{1}{1000000}\), etc., but not the number \(0\) itself. In contrast, a closed set is a set that contains its boundary. More abstractly, closed sets describe the notion of a "set that contains all points near it." In a metric space, we can measure nearness using the metric, so closed sets have a very intuitive definition. Working off this definition, one is able to define continuous functions in arbitrary metric spaces.
In what follows, assume \((M,d)\) is a metric space.
If \(A \subset M\) and \(x\in M\), the distance between \(A\) and \(x\) is defined to be \[d(x, A) = \inf_{y \in A} d(x,y),\] where \(\inf\) denotes the infimum, the largest number \(k\in \mathbb{R}\) for which \(d(x,y) \ge k\) for all \(y \in A\).
Consider a subset \(S \subset M\). The closure \(\overline{S}\) of \(S\) is \[\overline{S}:= \{y \in M \, : \, d(y, S) = 0 \}.\] Note that \(S \subset \overline{S}\) always, since \(d(y, S) = 0\) if \(y \in S\). The set \(S\) is called closed if \(S = \overline{S}\).
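When \(A\) is a finite set, the infimum in the definition of \(d(x, A)\) is simply a minimum, so it is easy to compute. The short Python sketch below (an illustration of mine, not from the original article) does this for the Euclidean metric on \(\mathbb{R}^2\):

```python
def dist_point_to_set(x, A, d):
    # For a finite set A, the infimum defining d(x, A) is a minimum.
    return min(d(x, a) for a in A)

euclid = lambda p, q: ((p[0] - q[0])**2 + (p[1] - q[1])**2) ** 0.5
A = [(0, 0), (1, 1), (2, 0)]
print(dist_point_to_set((1, 0), A, euclid))  # 1.0
print(dist_point_to_set((0, 0), A, euclid))  # 0.0, so (0, 0) lies in the closure of A
```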
Consider the metric space \(\mathbb{R}^2\) equipped with the standard Euclidean distance
\[d\big((x_1, x_2), (y_1, y_2)\big) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}.\]
How many of the following subsets \(S \subset \mathbb{R}^2\) are closed in this metric space?
- \(S = \{(x,y) \, : \, x^2 +y^2 = 1\}\)
- \(S = \{(x,y) \, : \, x^2 +y^2 \le 1\}\)
- \(S = \{(x,y) \, : \, x \in \mathbb{Q}, y \in \mathbb{Q} \}\)
- \(S = \{(x,0) \, : \, x\in \mathcal{C}\}\), where \(\mathcal{C} \subset \mathbb{R}\) is the middle-thirds Cantor set
Intuitively, if a function \(f: X \to Y\) is continuous, it should map points that are near one another in \(X\) to points that are near one another in \(Y\). Consider a closed subset \(C \subset Y\), so that \(C\) contains all points near it. If \(d\big(x, f^{-1} (C)\big) = 0\), then \(x\) is near to \(f^{-1} (C)\), so \(f(x)\) is near to points \(f(y)\) with \(y \in f^{-1}(C)\), i.e., near to \(C\) (by our intuitive understanding of continuity). But \(C\) contains all points near it, so \(f(x) \in C\), and hence \(x\in f^{-1} (C)\). This implies \(f^{-1} (C)\) is itself closed. One is therefore led to make the following definition:
Let \(X\) and \(Y\) be metric spaces. A function \(f: X \to Y\) is called continuous if, for every closed subset \(C \subset Y\), the set \(f^{-1} (C) \subset X\) is closed in \(X\).
Convergence of Sequences
Again, let \((M,d)\) be a metric space, and suppose \(\{x_n\}\) is a sequence of points in \(M\). Intuitively, one should declare that the sequence \(\{x_n\}\) converges to some \(x\in M\) if, for very large \(n\), the point \(x_n\) is very close to \(x\). Thus, the following definition is quite natural (and is essentially a metric space rephrasing of the epsilon-delta definition of a limit):
The sequence \(\{x_n\} \subset M\) is said to converge to \(x\in M\) if, for every \(\epsilon > 0\), there exists an index \(N_{\epsilon} \in \mathbb{N}\) such that for all \(n \ge N_{\epsilon}\), the inequality \(d(x_n, x) < \epsilon\) holds. Equivalently, \(\{x_n\}\) converges to \(x\) if and only if \[\lim_{n\to\infty} d(x_n, x) = 0. \] In either case, the notation for convergence is given by \(x_n \to x\) or \[\lim_{n\to\infty} x_n = x.\]
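To make the definition concrete, here is a brief Python illustration (a sketch with a sequence of my own choosing, not from the text) of \(x_n = 1/n\) in \(\mathbb{R}\) with the absolute value metric: the distances \(d(x_n, 0)\) eventually drop below any fixed \(\epsilon > 0\).

```python
# The sequence x_n = 1/n converges to 0 in (R, |.|):
# d(x_n, 0) = 1/n eventually drops below any epsilon > 0.
d = lambda x, y: abs(x - y)
for n in [1, 10, 100, 1000, 10**6]:
    x_n = 1.0 / n
    print(n, d(x_n, 0.0))
```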
For any convergent sequence \(x_n \to x\), the points \(x_n\) and \(x_m\) are very close for large \(m\) and \(n\), since both points are known to be close to \(x\). This lends itself to a fairly natural converse question. Suppose \(\{x_n\}\) is a sequence for which \(x_n\) and \(x_m\) are very close whenever \(m\) and \(n\) are very large. Must this sequence \(\{x_n\}\) converge?
A sequence \(\{x_n\} \subset M\) is called Cauchy if, for every \(\epsilon > 0\), there exists an index \(N_{\epsilon} \in \mathbb{N}\) such that whenever \(m, n \ge N_{\epsilon}\), the inequality \(d(x_m, x_n) < \epsilon\) holds. Equivalently, \(\{x_n\}\) is Cauchy if and only if \[\lim_{m, n\to\infty} d(x_m, x_n) = 0.\]
A metric space \(M\) is called complete if every Cauchy sequence in \(M\) converges.
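A classical way to see that completeness is a nontrivial property: the decimal truncations of \(\sqrt{2}\) form a Cauchy sequence of rational numbers whose limit is irrational, so \(\mathbb{Q}\) with the absolute value metric is not complete. The snippet below (a numerical sketch; the truncation scheme is my own choice) shows the distances between consecutive terms shrinking.

```python
from fractions import Fraction
import math

# x_k = sqrt(2) truncated to k decimal places: a Cauchy sequence of rationals.
def x(k):
    return Fraction(int(math.sqrt(2) * 10**k), 10**k)

# Consecutive terms get arbitrarily close (Cauchy), but the limit sqrt(2)
# is irrational, so the sequence has no limit within Q.
for k in range(1, 8):
    print(k, float(abs(x(k + 1) - x(k))))
```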
Knowing whether or not a metric space is complete is very useful, and many common metric spaces are complete. For instance, \(\mathbb{R}\) is complete under the standard absolute value metric, although this is not so easy to prove.
\(\mathbb{R}\) is a complete metric space.
Lemma: Let \((M, d)\) be a metric space. Suppose \(\{x_n\} \subset M\) is a Cauchy sequence. If \(\{x_n\}\) has a convergent subsequence, then \(\{x_n\}\) converges.
Suppose \(\{x_{n_k}\} \subset \{x_n\}\) is a convergent subsequence, with \(x_{n_k} \to x\) as \(k\to\infty\). Choose \(\epsilon > 0\) and \(K\) such that \(k\ge K\) implies \(d(x_{n_k}, x) < \frac{\epsilon}2\).
Since \(\{x_n\}\) is Cauchy, we can also choose \(N\) such that \(m, n \ge N\) implies \(d(x_n, x_m) < \frac{\epsilon}2\).
Let \(J\) be an index such that \(k\ge J\) implies \(n_k \ge N\); this exists simply because \(\{n_k\}\) is a strictly increasing sequence of positive integers. Then, for \(k\ge \max(J, K)\) and \(n\ge N\), we have \[d(x_{n}, x) \le d(x_n, x_{n_k}) + d(x_{n_k}, x) < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.\] Thus, \(\lim_{n\to\infty} x_n = x\). \(_\square\)
Let \(\{x_n\} \subset \mathbb{R}\) be a Cauchy sequence. By the above lemma, it suffices to prove \(\{x_n\}\) has a convergent subsequence.
Note that \(\{x_n\}\) must be bounded, since it is Cauchy. By the Bolzano-Weierstrass property of \(\mathbb{R}\), it follows \(\{x_n\}\) has a convergent subsequence. \(_\square\)
The following problem concerns the Banach fixed point theorem, an abstract result about complete metric spaces. This result may appear obscure and uninteresting, but the payoff is actually glorious: one can use it to prove the existence and uniqueness of solutions to a large class of ordinary differential equations!
Let \((M,d)\) be a complete metric space. A contraction is a function \(f: M \to M\) for which there exists some constant \(0 < c < 1\) such that \[d\big(f(x), f(y)\big) \le c \cdot d(x,y)\] for all \(x, y \in M\).
Answer the following yes-no questions:
If \(f: M \to M\) is a contraction, does \(f\) have a fixed point? \((\)I.e., is there some \(x\in M\) such that \(f(x) = x?)\)
If \(f: M \to M\) has a fixed point, is \(f\) a contraction?
Hint: The first question is much harder than the second. In fact, the answer is yes, and this extremely important result is known as the Banach fixed point theorem.
To prove it, choose an arbitrary \(x_0 \in M\) and set \(x_n = f(x_{n-1})\) for \(n\ge 1\). Then, show that \(x_n\) converges to some \(x\in M\) and that \(x\) is the desired fixed point.
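As a concrete illustration of the iteration described in the hint (an example of my own choosing, not part of the problem): \(f(x) = \cos x\) maps \([0,1]\) into itself and is a contraction there, since its derivative is bounded in absolute value by \(\sin 1 < 1\), so the iterates converge to its unique fixed point.

```python
import math

# Fixed-point iteration x_n = f(x_{n-1}) for the contraction f(x) = cos(x)
# on [0, 1]; by the Banach fixed point theorem the iterates converge to the
# unique fixed point x satisfying cos(x) = x (approximately 0.739085).
f = math.cos
x = 0.0
for n in range(50):
    x = f(x)
print(x)              # ≈ 0.739085
print(abs(f(x) - x))  # ≈ 0, so x is (numerically) a fixed point
```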