# Multivariate Regression

**Multivariate Regression** is a method used to measure the degree to which more than one independent variable (**predictors**) and more than one dependent variable (**responses**) are linearly related. The method is broadly used to predict the behavior of the response variables associated with changes in the predictor variables, once a desired degree of relation has been established.

**Exploratory Question**: Can a supermarket owner maintain stock of water, ice cream, frozen foods, canned foods and meat as a function of temperature, tornado chance and gas price during tornado season in June?

From this question, several obvious assumptions can be drawn: if it is too hot, ice cream sales increase; if a tornado hits, sales of water and canned foods increase while sales of ice cream, frozen foods and meat decrease; if gas prices increase, prices of all goods increase. A mathematical model based on multivariate regression analysis can address this and other, more complicated questions.

## Simple Regression

The **Simple Regression** model relates one predictor and one response.

Let $n$ observations be pairs of predictors and responses $(x_1,y_1),(x_2,y_2),\ldots ,(x_n,y_n)$, and let the errors $\epsilon_i\sim \mathcal{N}(0,\sigma^2)$ be i.i.d. (independent and identically distributed). For fixed real numbers $\beta_0$ and $\beta_1$ (parameters), the model is as follows:

$y_i=\beta_0+\beta_1 x_i + \epsilon_i$

The fitted model (fitted to the given data) is as follows:

$\hat y_i =\hat\beta_0+\hat\beta_1 x_i$

The estimated parameters are $\hat\beta_1=\frac{\sum{(x_i - \bar{x})(y_i - \bar{y})}}{\sum{(x_i - \bar{x})^2}}$ and $\hat\beta_0=\bar y - \hat\beta_1 \bar x$, where $\bar x$ and $\bar y$ are the sample averages.
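The closed-form estimates above translate directly into code. The following sketch uses a small hypothetical data set (temperatures and ice cream sales, echoing the supermarket example; the numbers are invented for illustration):

```python
import numpy as np

# Hypothetical data: temperature (predictor) vs. ice cream sales (response).
x = np.array([20.0, 24.0, 27.0, 31.0, 35.0])
y = np.array([110.0, 135.0, 155.0, 180.0, 210.0])

x_bar, y_bar = x.mean(), y.mean()

# Closed-form least squares estimates from the formulas above.
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

# Fitted model: y_hat_i = beta0_hat + beta1_hat * x_i
y_fit = beta0_hat + beta1_hat * x
```

As a sanity check, these estimates agree with NumPy's own degree-1 polynomial fit.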

**Note**: In most applications it is assumed that the error terms are i.i.d. $\mathcal{N}(0,\sigma^2)$. In general, the error terms are not assumed to follow a particular distribution; they are only assumed to satisfy $E(\epsilon_i)=0$, $Var(\epsilon_i)=\sigma^2$ and $Cov(\epsilon_i,\epsilon_j)=0$ for $i\neq j$ (zero mean, constant variance, and uncorrelated errors).

## Multiple Regression

The **Multiple Regression** model relates more than one predictor and one response.

Let $\textbf{Y}$ be the $n\times 1$ response vector and $\textbf{X}$ be an $n\times (q+1)$ matrix whose first column consists entirely of $1$'s and whose remaining $q$ columns contain the predictors. Let $\boldsymbol{\epsilon}$ be an $n\times 1$ vector such that the $\epsilon_i\sim \mathcal{N}(0,\sigma^2)$ are i.i.d. (independent and identically distributed), and let $\boldsymbol{\beta}$ be a $(q+1)\times 1$ vector of fixed parameters. The model is as follows:

$\textbf{Y}=\textbf{X}\boldsymbol{\beta}+\boldsymbol{\epsilon}$

In detailed notation, we have:

$\begin{pmatrix} y_{1}\\ y_{2}\\ y_{3}\\ \vdots\\ y_{n} \end{pmatrix} = \begin{pmatrix} 1&x_{11}&x_{12}&\ldots&x_{1q}\\ 1&x_{21}&x_{22}&\ldots&x_{2q}\\ 1&x_{31}&x_{32}&\ldots&x_{3q}\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ 1&x_{n1}&x_{n2}&\ldots&x_{nq} \end{pmatrix} \begin{pmatrix} \beta_{0}\\ \beta_{1}\\ \beta_{2}\\ \vdots\\ \beta_{q} \end{pmatrix} +\begin{pmatrix} \epsilon_{1}\\ \epsilon_{2}\\ \epsilon_{3}\\ \vdots\\ \epsilon_{n} \end{pmatrix}$
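Building the design matrix $\textbf{X}$ with its leading column of $1$'s is the only step that differs from simple regression. A minimal sketch with simulated data (dimensions and parameter values are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, q = 50, 3  # hypothetical sample size and number of predictors

X_raw = rng.normal(size=(n, q))           # the q predictor columns
X = np.column_stack([np.ones(n), X_raw])  # prepend the column of 1's -> n x (q+1)
beta = np.array([2.0, 0.5, -1.0, 3.0])    # true parameters beta_0, ..., beta_q
eps = rng.normal(scale=0.1, size=n)       # i.i.d. N(0, sigma^2) errors

Y = X @ beta + eps  # the model Y = X beta + eps

# A least squares fit recovers beta approximately.
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
```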

## Multivariate Regression

The **Multivariate Regression** model relates more than one predictor and more than one response.

Let $\textbf{Y}$ be the $n\times p$ response matrix and $\textbf{X}$ be an $n\times (q+1)$ matrix whose first column consists entirely of $1$'s and whose remaining $q$ columns contain the predictors. Let $\textbf{B}$ be a $(q+1)\times p$ matrix of fixed parameters, and let $\boldsymbol{\Xi}$ be an $n\times p$ error matrix whose rows are i.i.d. $\mathcal{N}(\boldsymbol{0},\boldsymbol{\Sigma})$ (multivariate normally distributed with covariance matrix $\boldsymbol{\Sigma}$). The model is as follows:

$\textbf{Y}=\textbf{X}\textbf{B}+\boldsymbol{\Xi}$

In detailed notation, we have:

$\begin{pmatrix} y_{11}&y_{12}&\ldots&y_{1p}\\ y_{21}&y_{22}&\ldots&y_{2p}\\ y_{31}&y_{32}&\ldots&y_{3p}\\ \vdots&\vdots&\ddots&\vdots\\ y_{n1}&y_{n2}&\ldots&y_{np}\\ \end{pmatrix} = \begin{pmatrix} 1&x_{11}&x_{12}&\ldots&x_{1q}\\ 1&x_{21}&x_{22}&\ldots&x_{2q}\\ 1&x_{31}&x_{32}&\ldots&x_{3q}\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ 1&x_{n1}&x_{n2}&\ldots&x_{nq} \end{pmatrix} \begin{pmatrix} \beta_{01}&\beta_{02}&\ldots&\beta_{0p}\\ \beta_{11}&\beta_{12}&\ldots&\beta_{1p}\\ \beta_{21}&\beta_{22}&\ldots&\beta_{2p}\\ \vdots&\vdots&\ddots&\vdots\\ \beta_{q1}&\beta_{q2}&\ldots&\beta_{qp}\\ \end{pmatrix} +\begin{pmatrix} \epsilon_{11}&\epsilon_{12}&\ldots&\epsilon_{1p}\\ \epsilon_{21}&\epsilon_{22}&\ldots&\epsilon_{2p}\\ \epsilon_{31}&\epsilon_{32}&\ldots&\epsilon_{3p}\\ \vdots&\vdots&\ddots&\vdots\\ \epsilon_{n1}&\epsilon_{n2}&\ldots&\epsilon_{np}\\ \end{pmatrix}$

The MLE and unbiased estimator for $\textbf{B}$ is called the least squares estimator, denoted $\boldsymbol{\hat B}$:

$\boldsymbol{\hat B}=(\boldsymbol{X^T}\boldsymbol{X})^{-1}\boldsymbol{X^T}\boldsymbol{Y}$

This estimator minimizes $(\boldsymbol{Y} - \boldsymbol{X}\boldsymbol{\hat B})^T(\boldsymbol{Y} - \boldsymbol{X}\boldsymbol{\hat B})$.
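The estimator $\boldsymbol{\hat B}=(\boldsymbol{X^T}\boldsymbol{X})^{-1}\boldsymbol{X^T}\boldsymbol{Y}$ can be computed by solving the normal equations $\boldsymbol{X^T}\boldsymbol{X}\boldsymbol{\hat B}=\boldsymbol{X^T}\boldsymbol{Y}$ directly, which is numerically preferable to forming the inverse. A sketch with simulated data (all dimensions and values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, p = 60, 2, 3  # hypothetical: 60 observations, 2 predictors, 3 responses

X = np.column_stack([np.ones(n), rng.normal(size=(n, q))])  # n x (q+1)
B = rng.normal(size=(q + 1, p))                             # (q+1) x p true parameters
Xi = 0.05 * rng.normal(size=(n, p))                         # error matrix
Y = X @ B + Xi                                              # the model Y = X B + Xi

# Least squares estimator: solve (X^T X) B_hat = X^T Y for B_hat.
B_hat = np.linalg.solve(X.T @ X, X.T @ Y)
```

Solving column by column like this is equivalent to running $p$ separate multiple regressions, one per response, all sharing the same design matrix.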

The unbiased estimator for $\boldsymbol{\Sigma}$, denoted $\boldsymbol{\hat \Sigma}$, is:

$\boldsymbol{\hat \Sigma}=\frac{1}{n-q-1}(\boldsymbol{Y} - \boldsymbol{X}\boldsymbol{\hat B})^T(\boldsymbol{Y} - \boldsymbol{X}\boldsymbol{\hat B})$
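The divisor $n-q-1$ plays the role of the residual degrees of freedom. A self-contained sketch (hypothetical simulated data) computing $\boldsymbol{\hat\Sigma}$ from the residual matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
n, q, p = 80, 2, 2  # hypothetical dimensions

X = np.column_stack([np.ones(n), rng.normal(size=(n, q))])
B = np.ones((q + 1, p))
Y = X @ B + rng.normal(scale=0.1, size=(n, p))

B_hat = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ B_hat  # the predicted error Xi_hat

# Unbiased estimator of Sigma, with divisor n - q - 1.
Sigma_hat = resid.T @ resid / (n - q - 1)
```

By construction $\boldsymbol{\hat\Sigma}$ is symmetric and positive semi-definite, like any covariance matrix.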

### Fitted Model

The fitted (prediction) model given by $\boldsymbol{\hat B}$ is as follows:

$\boldsymbol{\hat Y}=\boldsymbol{X}\boldsymbol{\hat B}$

With predicted error $\boldsymbol{\hat \Xi}=\boldsymbol{Y}-\boldsymbol{\hat Y}$.

### Sample Covariance and $r_{1}^{2}$

The sample covariance matrix $\boldsymbol{S}$ is a block matrix with blocks $\boldsymbol{S_{yy}}$, $\boldsymbol{S_{yx}}$, $\boldsymbol{S_{xy}}$ and $\boldsymbol{S_{xx}}$, and has the following form:

$\boldsymbol{S}=\begin{pmatrix} \boldsymbol{S_{yy}}&\boldsymbol{S_{yx}}\\ \boldsymbol{S_{xy}}&\boldsymbol{S_{xx}} \end{pmatrix}$

A measure of the association between the variables of the model, denoted $r_{1}^{2}$, ranges between zero and one. This measure, $r_{1}^{2}$, is the largest eigenvalue of the following matrix:

$\boldsymbol{S_{yy}^{-1}}\boldsymbol{S_{yx}}\boldsymbol{S_{xx}^{-1}}\boldsymbol{S_{xy}}$
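The blocks of $\boldsymbol{S}$ come from the sample covariance of the responses and predictors stacked side by side, and $r_1^2$ is then an eigenvalue computation. A sketch on simulated data (dimensions, coefficients and noise level are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
n, q, p = 100, 2, 2  # hypothetical dimensions

X_pred = rng.normal(size=(n, q))               # predictor columns (no column of 1's here)
B = np.array([[1.0, 0.5], [-0.5, 1.0]])
Y = X_pred @ B + 0.2 * rng.normal(size=(n, p)) # responses correlated with the predictors

# Sample covariance of the stacked data (responses first, then predictors),
# partitioned into the blocks S_yy, S_yx, S_xy, S_xx.
Z = np.column_stack([Y, X_pred])
S = np.cov(Z, rowvar=False)
S_yy, S_yx = S[:p, :p], S[:p, p:]
S_xy, S_xx = S[p:, :p], S[p:, p:]

# r_1^2 is the largest eigenvalue of S_yy^{-1} S_yx S_xx^{-1} S_xy.
M = np.linalg.solve(S_yy, S_yx) @ np.linalg.solve(S_xx, S_xy)
r1_sq = np.max(np.linalg.eigvals(M).real)
```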

**Cite as:** Multivariate Regression. *Brilliant.org*. Retrieved from https://brilliant.org/wiki/multivariate-regression/