Jensen's Inequality
Jensen's inequality is an inequality involving convexity of a function. We first make the following definitions:
A function is convex on an interval \(I\) if the segment between any two points taken on its graph \((\)in \(I)\) lies above the graph. An example of a convex function is \(f(x)=x^2\).
A function is concave on an interval \(I\) if the segment between any two points on the graph \((\)in \(I)\) lies below the graph. An example of a concave function is \(f(x)=-x^2\).
Contents
Statement of Jensen's Inequality
Now, Jensen's inequality is the following:
Jensen's Inequality
Let a real-valued function \(f\) be convex on the interval \(I\). Let \(x_1,...,x_n\in I\) and \(\omega_1,...,\omega_n\ge 0\). Then we have
\[\dfrac{\omega_1 f\left(x_1\right)+\omega_2 f\left(x_2\right)+\cdots+\omega_n f\left(x_n\right)}{\omega_1+\omega_2+\cdots+\omega_n} \ge f\left(\dfrac{\omega_1x_1+\omega_2x_2+\cdots+\omega_nx_n}{\omega_1+\omega_2+\cdots+\omega_n}\right).\]
If \(f\) is concave, the direction of inequality is flipped.
In particular if we take weights \(\omega_1=\omega_2=\cdots=\omega_n=1\), we get the inequality
\[\dfrac{f\left(x_1\right)+f\left(x_2\right)+\cdots+f\left(x_n\right)}{n} \ge f\left(\dfrac{x_1+x_2+\cdots+x_n}{n}\right). \ _\square\]
This proof is a little long. As it doesn't help much in the problem-solving aspect, you may skip it.
Note that we need only to prove the statement for convex functions. This is due to the following result:
If \(g\) is a concave function, then \(-g\) is a convex function.
We first prove the following statement :
\(P \iff Q,\) where \(P\) and \(Q\) are as follows:
\(P\) : \(\forall x_1,...,x_n\in I\) and \(\forall \omega_1,...,\omega_n\ge 0\) the following holds :
\[\dfrac{\omega_1 f\left(x_1\right)+\omega_2 f\left(x_2\right)+\cdots+\omega_n f\left(x_n\right)}{\omega_1+\omega_2+\cdots+\omega_n} \ge f\left(\dfrac{\omega_1x_1+\omega_2x_2+\cdots+\omega_nx_n}{\omega_1+\omega_2+\cdots+\omega_n}\right).\]
\(Q\) : \(\forall x_1,...,x_n\in I\) and \(\forall \lambda_1, \dots, \lambda_n \geq 0 \text{ and } \displaystyle \sum_{i=1}^n \lambda_i = 1 \) the following holds :
\[\sum_{i=1}^n \lambda_i f(x_i) \geq f \left(\sum_{i=1}^n \lambda_i x_i \right).\]
Proof:
Simply put \(\lambda_i = \dfrac{\omega_i}{\displaystyle \sum_{r=1}^n \omega_r},\) where \( i = 1,2, \dots n.\)
This means that we've reduced the problem to just showing that \(Q\) is true. To do this, we use induction.
The base case \(n = 1\) is trivially true.
The case \(n = 2\) is the definition of a convex function (the proof of this is left to the interested reader).
For \(n \geq 3 \), we shall assume that \(\lambda_n \in (0,1)\) \((\)if \(\lambda_n = 1,\) the assertion trivially holds, and if \(\lambda_n = 0,\) we appeal to the induction hypothesis).
Now, we proceed assuming that the cases \(k = 2\) and \(k = n-1\) are true :
\[\begin{align} f \left( \sum_{i = 1}^n \lambda_i x_i \right) &= f \left( (1-\lambda_n) \left(\sum_{i = 1}^{n-1} \mu_i x_i \right) + \lambda_n x_n \right)\qquad \left(\text{where }\ \mu_i = \frac{\lambda_i}{1-\lambda_n}, \ i = 1,2,\dots,n-1,\ \sum_{i=1}^{n-1} \mu_i = 1\right) \\ & \leq (1-\lambda_n)f \left(\sum_{i = 1}^{n-1} \mu_i x_i \right) + \lambda_n f(x_n)\\ & \leq \left( \sum_{i=1}^{n-1} (1-\lambda_n)\mu_i f(x_i) \right) + \lambda_n f(x_n)\\ & = \sum_{i = 1}^n \lambda_if(x_i). \end{align} \]
So, the proof is complete via induction. \(_\square\)
The last form is the form that is most frequently encountered and used in various Olympiad level problems.
How do we check if a function is convex or concave? We can't always plot the graph and check. The best (and often quicker) way is using calculus. A function \(f\) is convex on the interval \(I\) if and only if \(f''(x) \geq 0\) for all \(x\in I\) and concave if \(f''(x) \leq 0\) for all \(x\in I\). The precise statement is given below:
Let \(f: I \to \mathbb{R}\) be a twice-differentiable function.
Then
- \(f\) is convex on \(I\) if and only if \(f''(x) \geq 0 \ \forall x \in I;\)
- \(f\) is concave on \(I\) if and only if \(f''(x) \leq 0 \ \forall x \in I.\)
The proof of this can be found here.
For example, for \(f(x)=x^2\) we have \(f'(x)=2x,\) so \(f''(x)=2>0\) for all \(x\in\mathbb{R}\). So the function is convex everywhere on the graph. Similarly for \(f(x)=-x^2\) we have \(f''(x)=-2<0\) for all \(x\in\mathbb{R}\). So the function is concave everywhere on the graph.
Since we've been talking so much about \(f(x)=x^2\), let's apply Jensen's on it. We've already shown that it's convex everywhere. We choose reals \(x_1=1, x_2=2,...,x_n=n\). Applying Jensen's we get
\[\begin{align} \dfrac{f\left(x_1\right)+f\left(x_2\right)+\cdots+f\left(x_n\right)}{n} &\ge f\left(\dfrac{x_1+x_2+\cdots+x_n}{n}\right) \\ \dfrac{f\left(1\right)+f\left(2\right)+\cdots+f\left(n\right)}{n} &\ge f\left(\dfrac{1+2+\cdots+n}{n}\right) \\ \dfrac{1^2+2^2+\cdots+n^2}{n} &\ge f\left(\frac{\frac{n(n+1)}{2}}{n}\right) \\ \frac{\frac{n(n+1)(2n+1)}{6}}{n} &\ge f\left(\dfrac{n+1}{2}\right) \\ \dfrac{(n+1)(2n+1)}{6}&\ge \left(\dfrac{n+1}{2}\right)^2 \\ n &\ge 1 . \end{align}\]
Special Cases of Jensen's Inequality
AM-GM inequality (arithmetic mean-geometric mean inequality) is one of the special cases of Jensen's inequality:
\[\frac{\sum_{i=1}^n a_i}{n} \geq \sqrt[n]{\prod_{i=1}^n a_i}.\]
Consider the function \(f(x) = \log x \ \forall x > 0.\)
Observe that \(f^{\prime\prime}(x) = \frac{-1}{x^2} < 0\). Then \(f(x)\) turns out to be a concave function. \((\)Also notice that we can conclude \(f(x)\) is concave using the graph of \(\log x.)\) Thus, by Jensen's inequality we have
\[\begin{align} f\left(\frac{ \sum_{i=1}^n a_i}{n}\right) & \geq \frac{\sum_{i=1}^n f\left(a_i\right)}{n} \\ \Rightarrow \log \left(\frac{\sum_{i=1}^n a_i}{n}\right) & \geq \frac{\sum_{i=1}^n \log a_i }{n}. \end{align}\]
By the property of logarithms, \(\log x + \log y = \log xy\) and \(b\log a = \log a^b\). Therefore, we can simplify the terms on the RHS as
\[\begin{align} \log \left(\dfrac{\sum_{i=1}^n a_i}{n}\right) & \geq \dfrac{\sum_{i=1}^n \log a_i}{n} \\ & = \dfrac{\log a_1+\log a_2 + \cdots + \log a_n }{n} \\ & = \dfrac{\log \left(a_1 a_2 a_3 \cdots a_n\right)}{n} \\ & = \log \left(\prod_{i=1}^n a_i\right)^{\frac{1}{n}}. \end{align}\]
Taking antilogarithm, we get our desired result. \(_\square\)
Prove that for all \(n\in\mathbb{N},\)
\[\] \[\sqrt{1^2+1}+\sqrt{2^2+1}+\cdots +\sqrt{n^2+1}\ge \dfrac{n}{2}\sqrt{n^2+2n+5}.\]
Notice that on the left side \((\)denoted \(L)\) we have a sum of same terms with some different arguments, so we should think of applying Jensen's with the common form as a function. Define the function \(f(x)=\sqrt{x^2+1}.\) We can find the second derivative, but that's not needed if we observe that for \(x\to \infty\), we have \(f(x)=\sqrt{x^2+1}\to |x|\). So the graph should look like that of \(y=|x|\), V-shaped with a little curve at the bottom and thus convex.
Now we are ready to apply Jensen's with reals \(x_1=1,x_2=2,...,x_n=n\). We get
\[\begin{align} \dfrac{f\left(x_1\right)+f\left(x_2\right)+\cdots+f\left(x_n\right)}{n} &\ge f\left(\dfrac{x_1+x_2+\cdots+x_n}{n}\right) \\ \dfrac{f\left(1\right)+f\left(2\right)+\cdots+f\left(n\right)}{n} &\ge f\left(\dfrac{1+2+\cdots+n}{n}\right) \\ \dfrac{\sqrt{1^2+1}+\sqrt{2^2+1}+\cdots+\sqrt{n^2+1}}{n} &\ge f\left(\dfrac{n(n+1)/2}{n}\right) \\ &= f\left(\dfrac{n+1}{2}\right) \\ &= \sqrt{\left(\dfrac{n+1}{2}\right)^2+1} \\ &= \dfrac{1}{2}\sqrt{n^2+2n+5}. \end{align}\]
Hence, \(L\ge \dfrac{n}{2}\sqrt{n^2+2n+5},\) completing the proof. \(_\square\)
Applications of Jensen's Inequality
\[\dfrac{1}{a}+\dfrac{1}{b}+\dfrac{1}{c} = a+b+c\]
Let \(a,b,c\) be positive real numbers such that the above equation is satisfied. If the maximum value of the expression below is in the form of \( \frac {m}{n} ,\) where \(m,n\) are coprime positive integers, what is \(m+n?\)
\[\dfrac{1}{\left(2a+b+c\right)^2}+\dfrac{1}{\left(2b+c+a\right)^2}+\dfrac{1}{\left(2c+b+a\right)^2} \]