Mathematical Handbook

This handbook is a living document of mathematical tips and techniques that I encounter. The idea behind writing these down is first and foremost to give myself a list of possible angles of attack when once again an equation does not bow to intense staring alone.

Reverse Triangle Inequality

\begin{equation*} |x - y| \ge \left| |x| - |y| \right| \end{equation*}
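
This follows from the ordinary triangle inequality applied twice; a minimal derivation:

\begin{equation*} |x| = |(x - y) + y| \le |x - y| + |y| \quad \Rightarrow \quad |x| - |y| \le |x - y| \end{equation*}

Swapping the roles of \(x\) and \(y\) gives the same bound for \(|y| - |x|\), and combining both estimates yields the absolute value on the right-hand side.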

Exponential Function

Performing a series expansion yields

\begin{equation*} \exp(x) = \sum_{k = 0}^\infty \frac{x^k}{k!} \end{equation*}

Keeping just the first two terms already gives the bound \(\exp(x) \ge 1 + x\) for \(x \ge 0\); by convexity of the exponential function the bound in fact holds for all \(x \in \mathbb{R}\).

Factorial

The series expansion of the exponential function yields a lower bound for the factorial of \(n \in \mathbb{N}\).

\begin{equation*} e^n = \sum_{k = 0}^\infty \frac{n^k}{k!} \ge \frac{n^n}{n!} \Rightarrow n! \ge \left( \frac{n}{e} \right)^n \end{equation*}
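
As a quick numerical sanity check, take \(n = 5\):

\begin{equation*} \left( \frac{5}{e} \right)^5 \approx 21.06 \le 120 = 5! \end{equation*}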

Basics

Binomial Theorem

Let \(x, y \in \mathbb{K}\) with \(\mathbb{K} \in \{ \mathbb{R}, \mathbb{C} \}\) and \(n \in \mathbb{N}\).

\begin{equation*} (x + y)^n = \sum_{k = 0}^n \binom{n}{k} x^{n - k} y^k \end{equation*}
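
For example, for \(n = 3\) the binomial coefficients are \(1, 3, 3, 1\):

\begin{equation*} (x + y)^3 = x^3 + 3x^2 y + 3x y^2 + y^3 \end{equation*}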

Multinomial Theorem

Generalizes the binomial theorem to an arbitrary number of variables. Let \(m, n \in \mathbb{N}\) and \(x_1, ..., x_m \in \mathbb{K}\).

\begin{equation*} \left( \sum_{l = 1}^{m} x_{l} \right)^{n} = \sum_{k_{1} + ... + k_{m} = n} \frac{n!}{k_{1}! \cdots k_{m}!} \prod_{j = 1}^{m} x_{j}^{k_{j}} \end{equation*}
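
For \(m = 3\) and \(n = 2\) this reads

\begin{equation*} (x_1 + x_2 + x_3)^2 = x_1^2 + x_2^2 + x_3^2 + 2 x_1 x_2 + 2 x_1 x_3 + 2 x_2 x_3 \end{equation*}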

Algebra

Let \(V\) be a vector space equipped with an inner product \(\langle \cdot, \cdot \rangle\).

Generalized Binomial Formula

\begin{equation*} ||x + y||^2 = ||x||^2 + ||y||^2 + 2\langle x, y \rangle \end{equation*}
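
Expanding the induced norm shows where the cross term comes from; note that this form assumes a real inner product, in the complex case the cross term becomes \(2 \operatorname{Re} \langle x, y \rangle\).

\begin{equation*} ||x + y||^2 = \langle x + y, x + y \rangle = \langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle = ||x||^2 + 2 \langle x, y \rangle + ||y||^2 \end{equation*}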

Completing the square

This is the neat trick of adding and subtracting a term so that one part can be absorbed into a squared expression.

\begin{equation*} a^2 + 2ab = a^2 + 2ab + b^2 - b^2 = (a + b)^2 - b^2 \end{equation*}

The important thing to remember is that this also works for higher-dimensional variables and quadratic forms. An example application comes up in the derivation of the posterior distribution of a variable with a Gaussian prior.

\begin{equation*} x^{T}Ax + x^{T}Ay + y^{T}Ax = (x + y)^{T}A(x + y) - y^{T}Ay \end{equation*}
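
In the Gaussian setting one typically faces a quadratic term plus a linear term in \(x\). Assuming \(A\) is symmetric and invertible (and \(b\) is a fixed vector introduced here purely for illustration), the same idea gives

\begin{equation*} x^{T}Ax - 2x^{T}b = (x - A^{-1}b)^{T}A(x - A^{-1}b) - b^{T}A^{-1}b \end{equation*}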

Cauchy-Schwarz Inequality

\begin{equation*} |\langle x, y \rangle|^2 \le \langle x, x \rangle \cdot \langle y, y \rangle \end{equation*}

or, equivalently, in terms of the induced norm \(||x|| = \sqrt{\langle x, x \rangle}\)

\begin{equation*} |\langle x, y \rangle| \le ||x|| \cdot ||y|| \end{equation*}
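
One immediate consequence, for non-zero vectors in a real inner product space, is that the cosine of the angle between two vectors is well-defined:

\begin{equation*} -1 \le \frac{\langle x, y \rangle}{||x|| \cdot ||y||} \le 1 \end{equation*}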

Self-Adjoint Matrix from Vector

Let \(v \in \mathbb{C}^n\). Then the matrix

\begin{equation*} A = \begin{pmatrix} 0 & v^{*}\\ v & 0 \end{pmatrix} \end{equation*}

is self-adjoint and has operator norm \(||A|| = ||v||_2\). This can be useful, for example, to translate matrix inequalities into statements about vectors.
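
A quick way to verify the norm claim is to square the matrix, which gives a block-diagonal form:

\begin{equation*} A^2 = \begin{pmatrix} v^{*}v & 0\\ 0 & vv^{*} \end{pmatrix} \end{equation*}

The rank-one block \(vv^{*}\) has eigenvalues \(||v||_2^2\) and \(0\), and the scalar block is \(v^{*}v = ||v||_2^2\), so \(||A||^2 = ||A^2|| = ||v||_2^2\).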

Analysis

Taylor Series

Let \(f : \mathbb{R} \rightarrow \mathbb{R}\) be analytic at a point \(a \in \mathbb{R}\); note that infinite differentiability alone does not guarantee that the series converges to \(f\). Then

\begin{equation*} f(x) = \sum_{k = 0}^\infty \frac{f^{(k)}(a)}{k!} (x - a)^k \end{equation*}

where the remainder \(R_k(x) = f(x) - \sum_{j = 0}^k \frac{f^{(j)}(a)}{j!} (x - a)^j\) tends to \(0\) increasingly quickly as more terms are kept, i.e. \(R_k(x) = o(|x - a|^k)\) for \(x \rightarrow a\).
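
For example, expanding the cosine around \(a = 0\) gives

\begin{equation*} \cos(x) = 1 - \frac{x^2}{2} + \frac{x^4}{24} - \dots \end{equation*}

which, truncated after the quadratic term, yields the familiar small-angle approximation \(\cos(x) \approx 1 - \frac{x^2}{2}\).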

Fubini's Theorem

If \(f\) is measurable on the product space and \(S := \int_{X \times Y} |f(x, y)|~\mathrm{d}(x, y) < \infty\) (plus the usual technicalities, e.g. \(\sigma\)-finite measure spaces), then

\begin{equation*} \int_X \left( \int_Y f(x, y) \mathrm{d}y \right) \mathrm{d}x = \int_Y \left( \int_X f(x, y) \mathrm{d}x \right) \mathrm{d}y = S \end{equation*}
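
A simple illustrative case, taking \(X = Y = [0, \infty)\) with the Lebesgue measure, is an integrand that factorizes, so the double integral splits into a product of one-dimensional integrals:

\begin{equation*} \int_0^\infty \int_0^\infty e^{-x - y}~\mathrm{d}y~\mathrm{d}x = \left( \int_0^\infty e^{-x}~\mathrm{d}x \right) \left( \int_0^\infty e^{-y}~\mathrm{d}y \right) = 1 \end{equation*}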

Lebesgue Spaces

Let \((\Omega, \mathcal{A}, \mu)\) be a measure space. Then the set

\begin{equation*} \mathcal{L}^p(\Omega, \mathcal{A}, \mu) = \left\{ f : \Omega \rightarrow \mathbb{K} : f~\mathrm{measurable}, \int_\Omega |f(x)|^p \mathrm{d}\mu(x) < \infty \right\} \end{equation*}

is a vector space for any \(p \ge 1\) and \(\mathbb{K} \in \{ \mathbb{R}, \mathbb{C} \}\). The following defines a semi-norm on this set.

\begin{equation*} ||f||_p = \left( \int_\Omega |f(x)|^p \mathrm{d}\mu(x) \right)^{\frac{1}{p}} \end{equation*}
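
It is only a semi-norm because \(||f||_p = 0\) only forces \(f = 0\) almost everywhere, not everywhere. For instance, taking the Lebesgue measure on \(\mathbb{R}\) and the indicator function of the single point \(0\),

\begin{equation*} f = \mathbf{1}_{\{0\}} \ne 0 \qquad \text{but} \qquad ||f||_p = \left( \int_\mathbb{R} |\mathbf{1}_{\{0\}}(x)|^p~\mathrm{d}x \right)^{\frac{1}{p}} = 0 \end{equation*}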

Jensen's Inequality

Let \(f\) be a convex function and \(g\) be an integrable function. For a probability space \((\Omega, \mathcal{A}, \mu)\) this inequality reads

\begin{equation*} f\left( \int_\Omega g~\mathrm{d}\mu \right) \le \int_\Omega f \circ g~\mathrm{d}\mu \end{equation*}

An important special case is the application to the expected value

\begin{equation*} f(\mathbb{E}(X)) \le \mathbb{E}(f(X)) \end{equation*}
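
For example, choosing the convex function \(f(x) = x^2\) recovers the fact that the variance is non-negative:

\begin{equation*} \mathbb{E}(X)^2 \le \mathbb{E}(X^2) \end{equation*}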

Hölder's Inequality

For \(1 \le p, q \le \infty\) with \(\frac{1}{p} + \frac{1}{q} = 1\)

\begin{equation*} ||fg||_1 \le ||f||_p \cdot ||g||_q \end{equation*}

An important special case is the discrete version on the set \(\{1, ..., n\}\) with the counting measure.

\begin{equation*} \sum_{i = 1}^n |x_i y_i| \le \left( \sum_{i = 1}^n |x_i|^p \right)^{\frac{1}{p}} \left( \sum_{i = 1}^n |y_i|^q \right)^{\frac{1}{q}} \end{equation*}

which in turn has the Cauchy-Schwarz inequality as a special case.
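
Explicitly, setting \(p = q = 2\) gives

\begin{equation*} \sum_{i = 1}^n |x_i y_i| \le \left( \sum_{i = 1}^n |x_i|^2 \right)^{\frac{1}{2}} \left( \sum_{i = 1}^n |y_i|^2 \right)^{\frac{1}{2}} \end{equation*}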

Minkowski's Inequality

For \(1 \le p \le \infty\)

\begin{equation*} ||f + g||_p \le ||f||_p + ||g||_p \end{equation*}

Fourier Transform

Let \(f \in \mathcal{L}^1(\mathbb{R}^n)\) be an integrable function. Its continuous Fourier transform \(\hat{f}\) is given by

\begin{equation*} \hat{f}(y) = \int_{\mathbb{R}^n} f(x) e^{-2\pi i \langle y, x \rangle}~\mathrm{d}x \end{equation*}

If \(\hat{f}\) is itself integrable, the inverse Fourier transform restores the original \(f\) (almost everywhere).

\begin{equation*} f(x) = \int_{\mathbb{R}^n} \hat{f}(y) e^{2\pi i \langle y, x \rangle}~\mathrm{d}y \end{equation*}
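
With this normalization the standard Gaussian is a fixed point of the transform, which makes it a handy sanity check for sign and scaling conventions:

\begin{equation*} f(x) = e^{-\pi ||x||_2^2} \quad \Rightarrow \quad \hat{f}(y) = e^{-\pi ||y||_2^2} \end{equation*}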

Discrete Fourier Transform

Let \(a \in \mathbb{C}^N\) with entries indexed by \(0, ..., N - 1\). Its discrete Fourier transform is given by the complex vector \(\hat{a} \in \mathbb{C}^N\) with entries

\begin{equation*} \hat{a}_k = \sum_{j = 0}^{N - 1} e^{-2\pi i \frac{j k}{N}} \cdot a_j \end{equation*}

The original vector can be retrieved by the inverse discrete Fourier transform.

\begin{equation*} a_k = \frac{1}{N} \sum_{j = 0}^{N - 1} e^{2\pi i \frac{j k}{N}} \cdot \hat{a}_j \end{equation*}
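
A tiny worked example, choosing \(N = 2\) and \(a = (1, -1)\):

\begin{equation*} \hat{a}_0 = a_0 + a_1 = 0, \qquad \hat{a}_1 = a_0 + e^{-\pi i} a_1 = 1 + 1 = 2 \end{equation*}

Applying the inverse transform to \(\hat{a} = (0, 2)\) indeed restores \(a_0 = \frac{1}{2}(0 + 2) = 1\) and \(a_1 = \frac{1}{2}(0 + e^{\pi i} \cdot 2) = -1\).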

Probability Basics

Absolute Value

Split the event into its two tails and estimate each part separately; for \(a > 0\) the two events are disjoint, so the probabilities simply add up.

\begin{equation*} \Pr\left( |X| \ge a \right) = \Pr\left( X \le -a \right) + \Pr\left( X \ge a \right) \end{equation*}

Union bound

\begin{equation*} \Pr\left( \cup_i X_i \right) \le \sum_i \Pr(X_i) \end{equation*}
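
For two events this is just inclusion-exclusion with the non-negative intersection term dropped:

\begin{equation*} \Pr(X_1 \cup X_2) = \Pr(X_1) + \Pr(X_2) - \Pr(X_1 \cap X_2) \le \Pr(X_1) + \Pr(X_2) \end{equation*}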