ANalysis Of VAriance (ANOVA) Appendix 1: Proofs

1 Noncentral Chi-square distribution

1 Noncentral Chi-square distribution with \(k\) degrees of freedom

If \(X_i \sim \mathcal{N}(\mu_i, \sigma_i^2)\), \(i=1,2,\cdots,k\), are mutually independent, then

\[Y = \sum_{i=1}^{k}\frac{X_i^2}{\sigma^2_i} \sim {\chi'}^2_k(\delta) \]

where \(k\) is the number of degrees of freedom and \(\delta\) is the noncentrality parameter,

\[\delta = \sum_{i=1}^{k}\frac{\mu_i^2}{\sigma^2_i} = \mathbb{E}[Y] - k \]

and \(\mathbb{E}[Y]\) is the expectation of the random variable \(Y\):

\[\mathbb{E}[Y] = k + \sum_{i=1}^{k}\frac{\mu_i^2}{\sigma^2_i} \]
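
The identity \(\mathbb{E}[Y] = k + \delta\) can be checked numerically. The following is a minimal Monte Carlo sketch using only the Python standard library; the means, standard deviations, sample size, and seed are illustrative assumptions, not values from this appendix.

```python
import random


def simulate_mean_y(mu, sigma, n_draws=20000, seed=0):
    """Monte Carlo estimate of E[Y] for Y = sum_i (X_i / sigma_i)^2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # One draw of Y: independent X_i ~ N(mu_i, sigma_i^2), scaled and squared.
        total += sum((rng.gauss(m, s) / s) ** 2 for m, s in zip(mu, sigma))
    return total / n_draws


mu = [1.0, 2.0, 0.5]     # assumed means, chosen only for illustration
sigma = [1.0, 2.0, 1.0]  # assumed standard deviations
k = len(mu)
delta = sum((m / s) ** 2 for m, s in zip(mu, sigma))  # noncentrality parameter
estimate = simulate_mean_y(mu, sigma)
print(f"k + delta = {k + delta:.3f}, simulated E[Y] = {estimate:.3f}")
```

The simulated mean should land close to \(k + \delta\), up to Monte Carlo noise.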

2 Noncentral \(F\) distribution with \((k_1,k_2)\) degrees of freedom and noncentrality parameter \(\delta\)

If \(X_1\) follows a noncentral Chi-square distribution with noncentrality parameter \(\delta\), \(X_2\) follows a (central) Chi-square distribution, i.e., \(X_1 \sim {\chi'}^2_{k_1}(\delta)\) and \(X_2 \sim \chi^2_{k_2}\), and \(X_1\) and \(X_2\) are independent, then the following random variable \(F\) follows a noncentral \(F\) distribution:

\[F = \frac{X_1/k_1}{X_2/k_2} \sim{F'}_{k_1, k_2}(\delta) \]

3 Type II error in ANOVA

\[\beta = \mathrm{Pr} \left(F_0 \leq F_{\alpha, a-1, N-a} \mid H_0 \text{ is false } \right) \]

where

\[F_0 = \frac{ MS_{\text {Treatments }}}{MS_E} = \frac{ \left(SS_{\text {Treatments }} / \sigma^2 \right) / (a-1)}{ \left(SS_E / \sigma^2 \right) / (N-a)} \quad \sim \quad {F'}_{a-1,N-a} \left(\frac{n}{\sigma^2} \sum_{i=1}^a \tau_i^2 \right) \]
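
The Type II error probability \(\beta\) can be approximated by simulation. The sketch below is only an illustration under stated assumptions: the design (\(a = 3\), \(n = 5\)), the treatment effects, and \(\sigma\) are made up, and the critical value \(F_{\alpha, a-1, N-a}\) is replaced by an empirical quantile of \(F_0\) simulated under \(H_0\) rather than taken from an \(F\) table.

```python
import random


def f_statistic(effects, n, sigma, rng, mu=0.0):
    """Simulate one balanced one-way data set and return F0 = MS_Treatments / MS_E."""
    a = len(effects)
    data = [[mu + t + rng.gauss(0.0, sigma) for _ in range(n)] for t in effects]
    group_means = [sum(row) / n for row in data]
    grand_mean = sum(group_means) / a
    ms_treatments = n * sum((m - grand_mean) ** 2 for m in group_means) / (a - 1)
    ss_error = sum((y - m) ** 2 for row, m in zip(data, group_means) for y in row)
    ms_error = ss_error / (a * n - a)  # N - a degrees of freedom
    return ms_treatments / ms_error


def estimate_beta(effects, n, sigma, alpha=0.05, reps=20000, seed=0):
    """Return (empirical critical value, estimated Type II error probability)."""
    rng = random.Random(seed)
    a = len(effects)
    # Empirical (1 - alpha) quantile of F0 under H0 (all effects zero).
    null_stats = sorted(f_statistic([0.0] * a, n, sigma, rng) for _ in range(reps))
    critical = null_stats[int((1 - alpha) * reps) - 1]
    # Fraction of data sets under H1 in which H0 is (wrongly) not rejected.
    not_rejected = sum(
        1 for _ in range(reps) if f_statistic(effects, n, sigma, rng) <= critical
    )
    return critical, not_rejected / reps


effects = [-1.0, 0.0, 1.0]  # assumed treatment effects tau_i (sum to zero)
critical, beta = estimate_beta(effects, n=5, sigma=1.0)
print(f"critical value ~ {critical:.2f}, estimated beta ~ {beta:.3f}")
```

With these assumed values the noncentrality parameter is \(\frac{n}{\sigma^2}\sum_i \tau_i^2 = 10\); larger effects or more replicates increase it and reduce \(\beta\).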

4 Relation between \(\Phi^2\) and \(\delta\)

\[\delta = \Phi^2 \cdot a \]
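
As a small numeric illustration of this relation, the sketch below assumes the usual textbook definition \(\Phi^2 = \frac{n \sum_{i=1}^a \tau_i^2}{a \sigma^2}\), which this appendix does not state explicitly, together with the noncentrality parameter \(\delta = \frac{n}{\sigma^2}\sum_{i=1}^a \tau_i^2\) from item 3. The effects, \(n\), and \(\sigma\) are made-up values.

```python
import math


def noncentrality(effects, n, sigma):
    """delta = (n / sigma^2) * sum_i tau_i^2, as in the noncentral F statement above."""
    return n * sum(t ** 2 for t in effects) / sigma ** 2


def phi_squared(effects, n, sigma):
    """Phi^2 = n * sum_i tau_i^2 / (a * sigma^2); this definition is an assumption here."""
    a = len(effects)
    return n * sum(t ** 2 for t in effects) / (a * sigma ** 2)


effects = [-1.0, 0.0, 1.0]  # assumed treatment effects
n, sigma = 5, 1.0           # assumed replicates per treatment and error s.d.
a = len(effects)
delta = noncentrality(effects, n, sigma)
phi2 = phi_squared(effects, n, sigma)
print(delta, a * phi2, math.isclose(delta, a * phi2))
```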

2 Proofs

2.1 Fixed effect model

Proof of \(SS_T = SS_{\text{Treatments}} + SS_E\)

\[\begin{aligned} SS_T &= \sum_{i=1}^a \sum_{j=1}^n ( y_{i j}-\bar{y}_{i \cdot}+\bar{y}_{i \cdot}-\bar{y}_{\cdot \cdot} )^2 \\ &= \sum_{i=1}^a \sum_{j=1}^n \left[ ( \bar{y}_{i \cdot} - \bar{y}_{\cdot \cdot} )^2 + 2 (\bar{y}_{i \cdot}-\bar{y}_{\cdot \cdot} ) (y_{i j}-\bar{y}_{i \cdot} ) + (y_{i j}-\bar{y}_{i \cdot})^2 \right] \\ &= \underbrace{\sum_{i=1}^a \sum_{j=1}^n ( \bar{y}_{i \cdot} - \bar{y}_{\cdot \cdot} )^2 }_{SS_{\text{Treatments}}} + 2 \sum_{i=1}^a \sum_{j=1}^n \left[ (\bar{y}_{i \cdot}-\bar{y}_{\cdot \cdot}) (y_{i j} - \bar{y}_{i \cdot} ) \right] + \underbrace{ \sum_{i=1}^a \sum_{j=1}^n ( y_{i j} - \bar{y}_{i \cdot} )^2 }_{SS_E} \\ &= SS_{\text {Treatments }} + SS_E + 2 \sum_{i=1}^a \left\{ \left(\bar{y}_{i \cdot}-\bar{y}_{\cdot \cdot}\right) \times \left[\sum_{j=1}^n\left(y_{i j}-\bar{y}_{i \cdot}\right)\right] \right\} \\ &=S S_{\text {Treatments }}+ SS_E \qquad \text{because }\sum_{j=1}^n\left(y_{i j}-\bar{y}_{i \cdot}\right)=0 \end{aligned} \]
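
The decomposition can be verified numerically on any balanced data set. The following sketch uses a small made-up \(3 \times 3\) table (an assumption for illustration only) and computes the three sums of squares directly from their definitions.

```python
import math


def sums_of_squares(data):
    """Return (SS_T, SS_Treatments, SS_E) for a balanced one-way layout.

    `data` is a list of a rows (treatments), each with n observations.
    """
    a = len(data)
    n = len(data[0])
    grand_mean = sum(sum(row) for row in data) / (a * n)
    group_means = [sum(row) / n for row in data]
    ss_total = sum((y - grand_mean) ** 2 for row in data for y in row)
    ss_treatments = n * sum((m - grand_mean) ** 2 for m in group_means)
    ss_error = sum((y - m) ** 2 for row, m in zip(data, group_means) for y in row)
    return ss_total, ss_treatments, ss_error


# Made-up observations: 3 treatments, 3 replicates each.
data = [
    [7.0, 8.0, 9.0],
    [4.0, 5.0, 6.0],
    [1.0, 3.0, 2.0],
]
ss_total, ss_treatments, ss_error = sums_of_squares(data)
print(ss_total, ss_treatments, ss_error)
print(math.isclose(ss_total, ss_treatments + ss_error))
```

The cross term vanishes for every treatment separately, so the identity holds for any data, not only under the normal model.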

Proof of \(SS_{\text{Treatments}} / \sigma^2 \sim \chi^2_{a-1}\)

Proof: \(SS_{\text{Treatments}} / \sigma^2 \sim \chi^2_{a-1}\) if \(H_0\) is true (i.e., \(\tau_1 = \tau_2 = \cdots = \tau_a\) or \(\mu_1 = \mu_2 = \cdots = \mu_a\))

Assume \(H_0\) is true. Then \(y_{ij} \overset{i.i.d.}{\sim} \mathcal{N} \left(\mu, \sigma^2 \right) \), and therefore

\[\bar{y}_{i \cdot} = \frac{1}{n} \sum_{j=1}^{n} y_{ij} \quad \overset{i.i.d.}{\sim} \quad \mathcal{N} \left(\mu, \frac{\sigma^2}{n} \right) \]

Then let

\[z_i = \frac{\bar{y}_{i \cdot} - \mu}{\sqrt{\sigma^2 / n }} \quad \overset{i.i.d.}{\sim} \quad \mathcal{N}(0, 1) \]

Consider

\[\begin{aligned} SS_{\text {Treatments }} &= n \sum_{i=1}^a (\bar{y}_{i \cdot}-\bar{y}_{\cdot \cdot} )^2 \\ &= n \sum_{i=1}^a (\bar{y}_{i \cdot} - \mu + \mu - \bar{y}_{\cdot \cdot} )^2 \\ &= n \left[ \sum_{i=1}^a (\bar{y}_{i \cdot}-\mu)^2 - 2 \sum_{i=1}^a [(\bar{y}_{i \cdot}-\mu) (\bar{y}_{\cdot \cdot} - \mu ) ] + \sum_{i=1}^a (\bar{y}_{\cdot \cdot} - \mu )^2\right] \\ &= n \left[ \sum_{i=1}^a (\bar{y}_{i \cdot} - \mu )^2-a (\bar{y}_{\cdot \cdot}-\mu )^2 \right] \\ &= n \sum_{i=1}^a\left(\bar{y}_{i \cdot}-\mu\right)^2-n a\left(\bar{y}_{\cdot \cdot}-\mu\right)^2 \end{aligned} \]

Thus,

\[\begin{aligned} & \frac{n }{\sigma^2} \sum_{i=1}^a (\bar{y}_{i \cdot}-\mu)^2 \\ =& \frac{n}{\sigma^2} \sum_{i=1}^a (\bar{y}_{i \cdot}-\bar{y}_{\cdot \cdot})^2 + \frac{n a}{\sigma^2} \left(\bar{y}_{\cdot \cdot}-\mu \right)^2 \\ =& \underbrace{ \frac{n}{\sigma^2} \sum_{i=1}^a \left[ (\bar{y}_{i \cdot} - \mu) - \frac{1}{a} \sum \limits_{i'=1}^a (\bar{y}_{i' \cdot} - \mu ) \right]^2}_{SS_{\text {Treatments }} / \sigma^2} + \frac{n}{a \sigma^2} \left[\sum_{i=1}^a (\bar{y}_{i \cdot}-\mu ) \right]^2 \\ =& \underbrace{ \sum_{i=1}^{a} (z_i-\bar{z})^2}_{SS_{\text {Treatments }} / \sigma^2} + \frac{1}{a} \left(\sum_{i=1}^a z_i \right)^2 = \sum_{i=1}^{a} z_i^2 \\ =& \left[z_1, z_2, \cdots, z_a\right] \times \left[\mathbf{I}_{a \times a} - \frac{1}{a} \mathbf{1}_{a \times a} \right] \times \left[ \begin{array}{l} z_1 \\ z_2 \\ \vdots \\ z_a \end{array} \right] + \left[z_1, z_2, \cdots, z_a\right] \times \left[\frac{1}{a} \mathbf{1}_{a \times a} \right] \times \left[\begin{array}{l} z_1 \\ z_2 \\ \vdots \\ z_a \end{array} \right] \end{aligned} \]

where \(\bar{z} = \frac{1}{a} \sum_{i=1}^{a} z_i\), \(\mathbf{I}_{a \times a}\) is the \(a \times a\) identity matrix, and \(\mathbf{1}_{a \times a}\) is the \(a \times a\) all-ones matrix.

Here \(\text{Rank}\left[\mathbf{I}_{a \times a} - \frac{1}{a} \mathbf{1}_{a \times a} \right] = a - 1\) and \(\text{Rank}\left[\frac{1}{a} \mathbf{1}_{a \times a} \right]=1\), so the ranks of the two quadratic forms sum to \(a\), the number of independent standard normal variables \(z_i\).

Then, by Cochran's theorem, we have

\[\frac{SS_{\text{Treatments }}}{\sigma^2} = \sum_{i=1}^a \left(z_i - \bar{z} \right)^2 \sim \chi_{a-1}^2 \]
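
A quick simulation can make this result plausible: a \(\chi^2_{a-1}\) variable has mean \(a-1\) and variance \(2(a-1)\), so the simulated \(SS_{\text{Treatments}}/\sigma^2\) under \(H_0\) should show roughly these moments. The sketch below is a sanity check, not a proof; the design, \(\mu\), \(\sigma\), and seed are assumed values.

```python
import random
import statistics


def scaled_ss_treatments(a, n, mu, sigma, rng):
    """Draw one balanced data set under H0 and return SS_Treatments / sigma^2."""
    group_means = [
        sum(rng.gauss(mu, sigma) for _ in range(n)) / n for _ in range(a)
    ]
    grand_mean = sum(group_means) / a
    ss_treatments = n * sum((m - grand_mean) ** 2 for m in group_means)
    return ss_treatments / sigma ** 2


a, n = 4, 5            # assumed number of treatments and replicates
mu, sigma = 10.0, 2.0  # assumed common mean and error standard deviation
rng = random.Random(1)
draws = [scaled_ss_treatments(a, n, mu, sigma, rng) for _ in range(20000)]
sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)
print(f"mean ~ {sample_mean:.3f} (theory {a - 1}), "
      f"variance ~ {sample_var:.3f} (theory {2 * (a - 1)})")
```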

Proof of \(SS_E/\sigma^2 \sim \chi^2_{N-a}\)

To be completed...

Proof of \(\mathbb{E}[MS_E]=\sigma^2\)

Proof: \(MS_E\) is an unbiased estimator of \(\sigma^2\), i.e.,

\[\mathbb{E}[MS_E]=\mathbb{E} \left[\frac{SS_E}{N-a} \right]=\sigma^2 \]

\[\begin{aligned} \mathbb{E} \left[ MS_E \right] &= \mathbb{E} \left[ \frac{S S_E}{N-a} \right] = \frac{1}{N-a} \cdot \mathbb{E} \left[\sum_{i=1}^a \sum_{j=1}^n \left(y_{i j} -\bar{y}_{i \cdot}\right)^2 \right] \\ & = \frac{1}{N-a} \cdot \mathbb{E} \left[ \sum_{i=1}^a \sum_{j=1}^n \left(y_{i j}^2-2 y_{i j} \bar{y}_{i \cdot} + \bar{y}_{i \cdot}^2 \right) \right] \\ & = \frac{1}{N-a} \cdot \mathbb{E} \left[ \sum_{i=1}^a \sum_{j=1}^n y_{i j}^2 - 2 n \sum_{i=1}^a \bar{y}_{i \cdot}^2 + n \sum_{i=1}^a \bar{y}_{i \cdot}^2 \right] \\ & = \frac{1}{N-a} \cdot \mathbb{E} \left[ \sum_{i=1}^a \sum_{j=1}^n y_{i j}^2 - \frac{1}{n} \sum_{i=1}^a y_{i \cdot}^2 \right] \\ & =\frac{1}{N-a} \cdot \mathbb{E} \left[ \sum_{i=1}^a \sum_{j=1}^n \left(\mu+\tau_i+\varepsilon_{i j}\right)^2 - \frac{1}{n} \sum_{i=1}^a \left(\sum_{j=1}^n \left(\mu + \tau_i + \varepsilon_{i j} \right) \right)^2 \right] \\ & = \frac{1}{N-a} \left[n \sum_{i=1}^a \left(\mu + \tau_i \right)^2 + N \sigma^2 - n \sum_{i=1}^a \left(\mu + \tau_i \right)^2 - a \sigma^2 \right] \\ & = \sigma^2 \end{aligned} \]
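
The unbiasedness of \(MS_E\) holds whether or not \(H_0\) is true, which a short simulation can illustrate. In the sketch below the treatment effects are deliberately nonzero; all numeric values (effects, \(n\), \(\mu\), \(\sigma\), seed) are assumptions chosen for illustration.

```python
import random
import statistics


def mean_square_error(effects, n, mu, sigma, rng):
    """Simulate y_ij = mu + tau_i + eps_ij and return MS_E = SS_E / (N - a)."""
    a = len(effects)
    data = [[mu + t + rng.gauss(0.0, sigma) for _ in range(n)] for t in effects]
    group_means = [sum(row) / n for row in data]
    ss_error = sum((y - m) ** 2 for row, m in zip(data, group_means) for y in row)
    return ss_error / (a * n - a)


effects = [-2.0, 0.0, 2.0]   # assumed nonzero treatment effects (H0 is false)
n, mu, sigma = 4, 10.0, 1.5  # assumed replicates, overall mean, error s.d.
rng = random.Random(2)
ms_e_draws = [mean_square_error(effects, n, mu, sigma, rng) for _ in range(20000)]
ms_e_mean = statistics.fmean(ms_e_draws)
print(f"average MS_E ~ {ms_e_mean:.3f}, sigma^2 = {sigma ** 2:.3f}")
```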

Proof of \(\mathbb{E}[MS_{\text{Treatments}}]=\sigma^2+\frac{n}{a-1}\sum_{i=1}^{a}\tau_i^2\)

Proof: Under the assumption \(\sum_{i=1}^{a}\tau_i=0\), we have

\[\mathbb{E}[MS_{\text{Treatments}}] = \sigma^2+\frac{n}{a-1}\sum_{i=1}^{a}\tau_i^2 \]

First, we have

\[\begin{split} MS_{\text {Treatments }} &= \frac{n}{a-1} \sum_{i=1}^a\left(\bar{y}_{i \cdot} - \bar{y}_{\cdot \cdot}\right)^2 \\ &= \frac{n}{a-1} \left[ \sum_{i=1}^a \bar{y}_{i \cdot}^2 - 2 \sum_{i=1}^a \bar{y}_{i \cdot} \, \bar{y}_{\cdot \cdot} + \sum_{i=1}^a \bar{y}_{\cdot \cdot}^2 \right] \\ &=\frac{n}{a-1} \left[ \sum_{i=1}^a \bar{y}_{i \cdot}^2- a \bar{y}_{\cdot \cdot}^{2} \right] \end{split} \]

Then,

\[\begin{split} \mathbb{E} \left[ M S_{\text {Treatments }} \right] &= \frac{n}{a-1} \ \mathbb{E} \left[ \sum_{i=1}^a \bar{y}_{i \cdot}^2-a \bar{y}_{\cdot \cdot}^2 \right] \\ &= \frac{n}{a-1} \left( \sum_{i=1}^a \mathbb{E} \left[ \bar{y}_{i \cdot}^2 \right] - a \, \mathbb{E} \left[ \bar{y}_{\cdot \cdot}^2 \right] \right) \\ &= \frac{n}{a-1} \left[ \sum_{i=1}^a \left( \left( \mathbb{E} \left[\bar{y}_{i \cdot}\right] \right)^2 + \mathbb{D} \left[ \bar{y}_{i \cdot} \right] \right) - a \left( \left( \mathbb{E} \left[ \bar{y}_{\cdot \cdot}\right] \right)^2 + \mathbb{D} \left[ \bar{y}_{\cdot \cdot} \right] \right) \right] \\ & = \frac{n}{a-1} \left\{ \sum_{i=1}^a \left[ \left(\mu+\tau_i\right)^2 + \frac{\sigma^2}{n} \right] - a \left[ \left( \mu + \frac{ \sum_{i=1}^a \tau_i}{a}\right)^2 + \frac{\sigma^2}{a \cdot n} \right] \right\} \\ & = \sigma^2 + \frac{n}{a-1} \left[ \sum_{i=1}^a \left( \mu+\tau_i\right)^2-a \mu^2 \right] \\ & = \sigma^2 + \frac{n}{a-1} \left( \sum_{i=1}^a \tau_i^2+2 \mu \sum_{i=1}^a \tau_i \right) \\ & = \sigma^2 + \frac{n}{a-1} \sum \limits_{i=1}^a \tau_i^2 \end{split} \]
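
The expected value of \(MS_{\text{Treatments}}\) can also be checked by simulation. The sketch below draws data from the fixed effect model with effects that sum to zero and compares the average \(MS_{\text{Treatments}}\) with \(\sigma^2+\frac{n}{a-1}\sum_{i=1}^{a}\tau_i^2\); the numeric values and seed are illustrative assumptions.

```python
import random
import statistics


def mean_square_treatments(effects, n, mu, sigma, rng):
    """Simulate y_ij = mu + tau_i + eps_ij and return MS_Treatments."""
    a = len(effects)
    group_means = [
        sum(mu + t + rng.gauss(0.0, sigma) for _ in range(n)) / n for t in effects
    ]
    grand_mean = sum(group_means) / a
    return n * sum((m - grand_mean) ** 2 for m in group_means) / (a - 1)


effects = [-2.0, 0.0, 2.0]   # assumed effects, chosen so that sum(tau_i) = 0
n, mu, sigma = 4, 10.0, 1.5  # assumed replicates, overall mean, error s.d.
a = len(effects)
rng = random.Random(3)
draws = [mean_square_treatments(effects, n, mu, sigma, rng) for _ in range(20000)]
ms_treat_mean = statistics.fmean(draws)
expected = sigma ** 2 + n / (a - 1) * sum(t ** 2 for t in effects)
print(f"average MS_Treatments ~ {ms_treat_mean:.3f}, formula gives {expected:.3f}")
```

Setting all effects to zero reduces the formula to \(\sigma^2\), which is why \(F_0\) tends to be near 1 under \(H_0\) and larger otherwise.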

2.2 Random Effect Model

posted @ 2022-10-22 16:27  veager