统计计算——假设检验

Lecture_9-hypothesis_Testing_with_R

library(tidyverse)
library(knitr)
library(dplyr)
library(palmerpenguins)
library(ggplot2)

Statistical inference

通过样本（sample）去估计总体（population）.

point estimate
interval estimate (比起点估计，多了一个精确度指标)
rejection of a hypothesis

Point estimate

Moment estimator

Let $X_1,...,X_n$ i.i.d. as samples of X.

\[\hat \mu = \bar X \]

Maximum likelihood estimator

\[L(\mu|\mathbb{X}) = p(X_1,...X_n| \mu, \sigma) \]

then take the devarient of $\frac{dL(\mu, \sigma)}{d\mu} = 0$ and $\frac{dL(\mu, \sigma)}{d\sigma} = 0$.

For normal distribution, $$ \hat\mu = \frac{1}{n}\sum_{i=1}^n X_i$$, and $$ \hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n(X_i - \bar X)^2$$
注意：如果方差下面是n,那就是有偏的，所以需要修偏改成n-1。

Confidence Interval

Confidence interval
Note that $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$. If $\sigma^2$ is known, we have the standardized $\bar{X}$ follows the standard normal distribution:

\[Z=\frac{\bar{X}-\mu}{\sigma / \sqrt{n}} \sim N(0,1) \]

Note that

\[P\left(-z_{\alpha / 2} \leq Z \leqq z_{\alpha / 2}\right)=1-\alpha, \]

where $z_{\alpha / 2}$ is the $\alpha / 2$ upper quantile of $N(0,1)$. As a result,

\[P\left(\bar{X}-z_{\alpha / 2} \frac{\sigma}{\sqrt{n}} \leq \mu \leq \bar{X}+z_{\alpha / 2} \frac{\sigma}{\sqrt{n}}\right)=1-\alpha, \]

which means that

\[\left[\bar{X}-z_{\alpha / 2} \frac{\sigma}{\sqrt{n}}, \bar{X}+z_{\alpha / 2} \frac{\sigma}{\sqrt{n}}\right] \]

is the $100(1-\alpha) \%$ confidence interval of the parameter $\mu$.

If $\sigma^2$ is unknown, we can use the sample sd to replace the population sd:

\[T=\frac{\bar{X}-\mu}{s / \sqrt{n}} \sim t_{n-1}, \]

where

\[s=\sqrt{\frac{1}{n-1} \sum_{i=1}^n\left(X_i-\bar{X}\right)^2} . \]

The $100(1-\alpha) \%$ confidence interval becomes

\[\left[\bar{X}-t_{\alpha / 2, n-1} \frac{s}{\sqrt{n}}, \bar{X}+t_{\alpha / 2, n-1} \frac{s}{\sqrt{n}}\right] \]

where $t_{\alpha / 2, n-1}$ is the $\alpha / 2$ upper quantile of $t_{n-1}$ distribution.

Find the pivotal quantity that follow a specific distribution.

$z_{\alpha/2}$可以通过q-distribution函数求出。

ci_normal <- function(x, sigma, alpha = 0.05){
  
  lower <-
  upper <-
  return c(lower, upper)
}
ci_normal(x, sigma = 7)

如果$\sigma$未知，可以使用t分布。

penguins <- drop_na(penguins)
x <- penguins$body_mass_g
xbar <- mean(x)
s <- sd(x)
n <- length(x)
alpha <- 0.05
t.quantile <- qt(alpha/2, df = n - 1, lower = FALSE)
ci.lower <- xbar - t.quantile * s / sqrt(n)
ci.upper <- xbar + t.quantile * s / sqrt(n)
c(ci.lower, ci.upper)

也可以把是否需要simga放在一个函数里。

ci <- function(x, sigma = NULL, alpha = 0.05){
  if(is.null(sigma)){
  
  } else {
  
  }
}

Hypothesis testing

$H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$

Steps to do a hypothesis testing:

State the null hypothesis and alternative hypothesis.
Find an estimator of the parameter.
Derive the distribution of the estimator.
Based on the distribution of the estimator, find a test statistic whose distribution has no unknown parameters.
Calculate the p-value. This is the probability, under the null hypothesis, of
sampling a test statistic at least as extreme as that which was observed.
Select a significance level（显著性水平） α, a probability threshold below which the null hypothesis will be rejected, and then make your conclusion.

Solution.
Let $n=333$, and $X_1, \ldots, X_n$ be the body mass of the penguins. To test $\mu=\mu_0$ :

State the $H_0$ and $H_1$.
Find the estimator:

\[\bar{X}=\frac{1}{n} \sum_{i=1}^n X_i . \]

Derive the distribution of $\bar{X}$ :

\[\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right), \]

where $\mu$ is the population mean of $X$, and $\sigma^2$ is the population variance of $X$. As $\sigma^2$ is unknown, we can use $s^2$, the sample variance, to replace $\sigma^2$. As a result, if $H_0$ is true,

\[T=\frac{\bar{X}-\mu_0}{s / \sqrt{n}} \sim t_{n-1} . \]

Choose $T$ as our test statistic. Given that we have observed the sample, we can calculate the value of $T$ by

\[t=\frac{\bar{x}-\mu_0}{s / \sqrt{n}}=\frac{4207.057-4300.000}{805.216 / \sqrt{333}}=-2.106 . \]

Find the $p$-value.
Note that the $p$-value is the probability of sampling a test statistic at least as extreme as that which was observed. The more extreme event is that we observe a $t$-value whose absolute value is larger.

\[p \text {-value }=P(|T| \geq|-2.106|)=P(T>2.106)+P(T<-2.106)=0.036 . \]

Select a significance level $\alpha=0.05$, and we reject $H_0$ at the significance level $0.05$ because $p$-value is less than $0.05$.

p值：在原假设成立的情况下，观测到的点足够极端的概率。

如果$H_0$不成立，则4中$\bar x$的值不是很大，就是很小，会导致t值不是很大，就是很小。

pt(-2.106, df = 333 - 1, lower = TRUE)
pt( 2.106, df = 333 - 1, lower = FALSE)

注意有两种备择假设，单边的或者双边的。

如果是大于的情况。
$P(T > t) = P(T > -2.106) = 0.9972$ 就不会拒绝了。

t_test <- function(x, mu0, alternative){
  if(!is.vector(x) | length(x) < 2){
    stop("please input a vector with length > 1")
  }
  if(!(alternative %in% c('two.sided', 'greater', 'lower'))){
    stop("")
  }
  if(alternative == 'two.sided'){
    n <- length(x)
    xbar <- mean(x)
    s <- sd(x)
    t_value <- (xbar - mu0)/(s / sqrt(n))
    p_value <- 2 * pt(abs(t_value), df = n - 1, lower = FALSE)
  }
  return(p_value)
}

penguins <- drop_na(penguins)
t_test(1:10, 4, "two.sided")

Rejection region

先把标准(拒绝域 rejection region)算出来。

t.test

t.test(x,
       alternative = c("two.sided", "less", "greater"),
       mu = 0,
       conf.level = 0.95
       )

myt <- with(penguins, t.test(body_mass_g, mu = 4300))
# with 的作用就是body_mass_g是penguins是它的一列

myt$p.value

Likelihood ratio test: how to construct a test statistic

Assume that a random sample $\mathbf{X}=\left\{X_1, \cdots, X_n\right\}$ of size $n$ is available, consider the generic testing problem

\[H_0: \theta \in \Theta_0 \text { v.s. } H_1: \theta \in \Theta_1, \]

construct the test statistic

\[\lambda(\mathbf{X})=\frac{\sup _{\theta \in \Theta_0} L(\theta \mid \mathbf{X})}{\sup _{\theta \in \Theta} L(\theta \mid \mathbf{X})}, \]

where $L(\theta \mid \mathbf{X})$ represents likelihood function of the sample $\mathbf{X}$. The Likelihood Ratio Test with nominal level $\alpha$ is to reject $H_0$ iff $\lambda(\mathrm{X})<c$, where $c$ is determined by the size restriction:

\[\mathrm{P}\left(\lambda(\mathrm{X})<c \mid \theta \in \Theta_0\right)=\alpha . \]

One sample test

Eg. One sample test, if $\sigma$ is known.

Let $X_1, \cdots, X_n$ be a random sample from $N\left(\mu, \sigma^2\right)$ population. Consider testing

\[H_0: \mu=\mu_0 \text { v.s. } H_1: \mu \neq \mu_0 \]

here we have

\[\Theta_0=\left\{\mu_0\right\} \quad \Theta_1=\left\{\mu_0\right\}^c . \]

Let $\theta=\mu$ be our parameter.
The numerator of $\lambda(\mathbf{X})$ is

\[\sup _{\theta \in \Theta_0} L\left(\mu \mid \mathbf{X}, \sigma^2\right)=\frac{1}{(\sigma \sqrt{2 \pi})^n} \exp \left\{-\frac{1}{2 \sigma^2} \sum_{i=1}^n\left(X_i-\mu_0\right)^2\right\} . \]

The denominator of $\lambda(\mathrm{X})$ is

\[\sup _{\sigma \in \Theta} L\left(\mu \mid \mathbf{X}, \sigma^2\right)=\frac{1}{(\sigma \sqrt{2 \pi})^n} \exp \left\{-\frac{1}{2 \sigma^2} \sum_{i=1}^n\left(X_i-\bar{X}\right)^2\right\} . \]

Thus the likelihood ratio becomes

\[\lambda(\mathbf{X})=\exp \left\{-\frac{n}{2 \sigma^2}\left(\bar{X}-\mu_0\right)^2\right\} . \]

We reject $H_0$ if $\lambda(\mathbf{X})$ is too small, or equivalently, if

\[\frac{\left(\bar{X}-\mu_0\right)^2}{\sigma^2 / n} \text { is too large. } \]

The final form of the likelihood ratio test is that $$\text{ reject } H_0 \text{ if } {|\frac{\bar X - \mu_0}{\sigma / \sqrt{n}}| > z_{\alpha/2}}$$, where $z_{\alpha/2}$ is the upper $\alpha / 2$ quantile of N(0,1).

Eg. One sample test, if $\sigma$ is unknown.

Let $\theta=\left(\mu, \sigma^2\right)$, and let

\[\Theta_0=\left\{\mu_0\right\} \times \mathbb{R}_{+}, \quad \Theta=\mathbb{R} \times \mathbb{R}_{+} . \]

Let

\[\hat{\sigma}_0^2=\frac{1}{n} \sum_{i=1}^n\left(X_i-\mu_0\right)^2, \quad \hat{\sigma}^2=\frac{1}{n} \sum_{i=1}^n\left(X_i-\bar{X}\right)^2 . \]

It can be shown that

\[\sup _{\theta \in \Theta_0} L(\theta \mid \mathbf{X})=\frac{1}{\left(\hat{\sigma}_0 \sqrt{2 \pi}\right)^n} \exp (-n / 2), \quad \sup _{\theta \in \Theta} L(\theta \mid \mathbf{X})=\frac{1}{(\hat{\sigma} \sqrt{2 \pi})^n} \exp (-n / 2) . \]

Example: One sample test, if $\sigma$ is unknown
The likelihood ratio is

\[\lambda(\mathrm{X})=\left(\frac{\hat{\sigma}^2}{\hat{\sigma}_0^2}\right)^{n / 2}=\left(1+\frac{n\left(\bar{X}-\mu_0\right)^2}{\sum_{i=1}^n\left(X_i-\bar{X}\right)^2}\right)^{-n / 2} . \]

We reject $H_0$ is $\lambda(\mathbf{X})$ is too small, or equivalently, if

\[\left|\frac{\sqrt{n}\left(\bar{X}-\mu_0\right)}{\sqrt{\frac{1}{n-1} \sum_{i=1}^n\left(X_i-\bar{X}\right)^2}}\right|=\left|\frac{\bar{X}-\mu_0}{s / \sqrt{n}}\right| \text { is too large. } \]

This is equivalent to the usual test statistic.

Two sample t test

Let $X_{11},...,X_{1n_1} \sim N(\mu_1, \sigma_1^2)$ and $X_{21},...,X_{2n_2} \sim N(\mu_2, \sigma_2^2)$, we want to know $H_0: \mu_1 = \mu_2, H_1: \mu_1 \neq \mu_2$

H0 and H1
Estimators and their distributions, or likelihood ratio
test statistic and its distribution
p-value

区分两种情况：

$\sigma_1 = \sigma_2$

思路：
估计值$\bar X_1 - \bar X_2$
$\bar X_1 \sim N(\mu_1, \sigma_1^2/n_1)$
$\bar X_2 \sim N(\mu_2, \sigma_2^2/n_2)$
此处我们假设$\sigma_1 = \sigma_2$
所以$\bar X_1- \bar X_2 \sim N(\mu_1 - \mu_2, \frac{\sigma^2}{n_1}+\frac{\sigma^2}{n_2})$
所以 $\frac{\bar X_1 - \bar X_2}{\sigma\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}$
但是我们不知道 $\sigma$.
所以我们用总样本($X_1$和$X_2$)的无偏方差来估计$\sigma$(所以均值就是)
所以就是 $\frac{1}{n_1+n_2+1} \sum_{i=1}^{n_1}$

If $\sigma_1=\sigma_2$
The test statistic (2-sample t-statistic) is

\[T=\frac{X_1-\bar{X}_2}{S_p \sqrt{1 / n_1+1 / n_2}} ; \]

where

\[ \begin{aligned} &X_j=\frac{1}{n_j} \sum_{i=1}^{n_j} X_{j i}, \quad j=1,2, \\ &S_j^2=\frac{1}{n_j-1} \sum_{i=1}^{n_j}\left(X_{j i}-\bar{X}_j\right)^2, \quad j=1,2, \\ &S_p^2=\frac{\left(n_1-1\right) S_1^2+\left(n_2-1\right) S_2^2}{n_1+n_2-2}, \end{aligned} \]

The distribution of $T$ is

\[T \sim t_{n_1+n_2-2} . \]

$\sigma_1 \neq \sigma_2$

思路：现在两个sigma不一样了，所以要分开估计$\hat\sigma_1^2$,所以$\bar X_1 - \bar X_2$ 是服从$N(\mu_1 - \mu_2, \frac{\hat\sigma_1^2}{n_1} - \frac{\hat\sigma_2^2}{n_2})$

If $\sigma_1 \neq \sigma_2$
The test statistic (the Welch's t-statistic) is

\[T=\frac{\tilde{X}_1-\bar{X}_2}{\sqrt{S_1^2 / n_1+S_2^2 / n_2}}, \]

and its distribution is

\[T \sim t_\nu, \quad \hat \nu=\frac{\left(S_1^2+S_2^2\right)^2}{S_1^4 /\left(n_1-1\right)+S_2^4 /\left(n_2-1\right)} . \]

test_means <- function(x,y, var.equal = FALSE, alternative){
  n1 <- length(x)
  n2 <- length(y)
  mean_x <-  mean(x)
  mean_y <-  mean(y)
  if(n1 < 1 || n2 < 1 || n1+n2 < 3){
    stop("x and y is not long enough")
  }
  if(var.equal==TRUE){
    sp_square <- (n1-1)*var(x) + (n2-1)*var(y) / (n1+n2-2)
    t_value <- (mean_x - mean_y) / sqrt(sp_square * (1/n1+1/n2))
    df <- n1+n2-2
  }  else {
    t.value = (mean_x - mean_y) / sqrt(var(x)/n1 + var(y)/n2)
    df <- n1+n2-2###not sure
  }
  p_value <- 0
  if(alternative -- 'two-sided'){
    p_value <- 2* pt(abs(t_value), df = df, lower = FALSE)
  }
  return(p_value)
}

Paired Data

sample sizes are equal
samples are not independent

Paired T test
$D_i = X_{1i} - X_{2i}$, 所以想要检测$X_1$的均值和$X_2$的均值的是否相等，可以转化为检验$\mu_D = 0$。就转化成了一个单样本的检验。
假设$D \sim N(\mu_D, \sigma_D^2)$

library(MASS)
data(UScrime)
qqnorm(UScrime$Po1)

在做t检验之间，应该要检测变量是否服从正态分布。

Several ways to test normality

shapiro.test()

检验变量是否服从正态分布，原假设一般是是正态，备择假设一般是不是正态。

nortest::ad.test()

ks.test()

Nonparametric tests

Wilcoxon Signed-Rank Test

既可以用做单样本，也可以用作双样本。

One Sample Case

Let $Y_1, · · · , Y_n$ be i.i.d. random variables with the common absolutely
continuous distribution F.

F: symmetric.

we want to test:$$H_0: \mu = 0, \text{ v.s. } H_1: \mu \neq 0$$

Let $R_1, . . . , R_n$ be the rank of
$|Y_1 - \mu_0|, ..., |Y_n-\mu_0|$.

Define $$\Psi_i=
\begin{cases}
0& \text{if } Y_i-\mu_0 < 0\
1& \text{if } Y_i-\mu_0 \ge 0
\end{cases}$$

the Wilcoxon’s signed rank statistic is defined by $W = \sum_{i=1}^n R_i\Psi_i$

y <- c(3,5,9,10,4)
mu0 <- 5
(y.cen.abs <- abs(y-mu0))
(r <- rank(y.cen.abs))
(psi <- (y > mu0))
(w <- sum(r*psi))

A large sample Wilcoxon signed rank test

Theorem.
When $H_0:\mu=\mu_0$ is true, we have $$E(W) = \frac{n(n+1)}{4}, \quad Var(W)=\frac{n(n+1)(2n+1)}{24}$$
Also, for $n\ge 12$, the distribution of $$Z = \frac{W - \frac{n(n+1)}{4}}{\sqrt{Var(W)=\frac{n(n+1)(2n+1)}{24}}} \rightarrow N(0,1)$$

Let $w$ be the signed rank statistic based on $n$ independent observations, each drawn from a continuous and symmetric pdf, where $n \geq 12$. Let

\[z=\frac{w-\frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2 n+1)}{24}}} . \]

To test $H_0: \mu=\mu_0$ and $H_1: \mu>\mu_0$ at the $\alpha$ level of significance, we reject $H_0$ if $z>z_\alpha$.
To test $H_0: \mu=\mu_0$ and $H_1: \mu<\mu_0$ at the $\alpha$ level of significance, we reject $H_0$ if $z<-z_0$.
To test $H_0: \mu=\mu_0$ and $H_1: \mu \neq \mu_0$ at the $\alpha$ level of significance, we reject $H_0$ if $z>z_{\alpha / 2}$ or $z<-z_{\alpha / 2}$.
The p-value can be calculated.

Eg.

y <- c(51, 53, 43, 36, 55, 55, 39, 43, 45, 27, 21, 26, 22, 43)
hist(y)
r <- rank(abs(y-28))
psi <- (y>28)
w <- sum(r * psi)
n <- length(y)
mean_w <- n * (n + 1) / 4
sd_w <- sqrt(n * (n + 1) * (2 * n + 1) / 24)
(z <- (w - mean_w)/sd_w)
(z_crit <- qnorm(0.95))
(p_value <- pnorm(z,lower.tail=FALSE))

wilcox.test(y,mu=28,alternative = "greater",exact=FALSE)$p.value

wilcox.test(y,mu=28,alternative = "greater",exact=FALSE,correct=FALSE)$p.value

Testing Paired data

A Wilcoxon signed rank test can also be used on paired data to test $H_0: \mu_D=0$, where $\mu_D=\mu_X-\mu_Y$.
Suppose we have a random sample of size $n,\left(x_i, y_i\right), 1 \leq i \leq n$, and let $d_i=x_i-y_i$.
Let $r_i$ be the rank of $\left|x_i-y_i\right|$ in the set

\[\left|x_1-y_1\right|,...,\left|x_n-y_n\right| \text {. } \]

Define $\psi_i=I\left(x_i>y_i\right)$
Let $w=\sum_{i=1}^n r_i \psi_i$

Eg.

x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06, 3.14, 1.29)
wilcox.test(x,y,paired=TRUE, alternative = "greater")

Testing two independent random samples*

Let $X_1 \ldots, X_n$ and $Y_1, \ldots, Y_m$ be two independent random samples of sizes $n$ and $m$ from $f_X$ and $f_Y$, respectively.
For $1 \leq i \leq n+m$, let $R_i$ be the rank of the $i$ th observation in the combined sample, and let

$\Psi_i= \begin{cases}1 & \text { if the } i \text { th observation came from } f_X, \\ 0 & \text { if the } i \text { th observation came from } f_Y .\end{cases}$
Define

\[W=\sum_{i=1}^{n+m} R_i \Psi_i \]

We have

\[E(W)=\frac{n(n+m+1)}{2}, \quad \operatorname{Var}(W)=\frac{n m(n+m+1)}{12} . \]

\[ Z=\frac{W-m(n+m+1) / 2}{\sqrt{n m(n+m+1) / 12}} \rightarrow N(0,1) $$ if $n>10$ and $m>10$. Eg. ```{r} x <- c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46) y <- c(1.15, 0.88, 0.90, 0.74, 1.21) wilcox.test(x, y, alternative = "greater", exact = FALSE, correct = FALSE) ``` ## Monte Carlo Methods for hypothesis tests 这一part的目的就是评价各种test的好坏，指标是power。我们固定alpha=typeI-error=0.05, 但是不同的test会有不同的分布，所以typeIIerror会不一样，所以power会不一样。power越大越好。 ### Type-I error and Type-II error - Type I error is that we reject $H_0$ given that $H_0$ is true. - Type II error is that we fail to reject $H_0$ given that $H_0$ is not true. - power of a test: we reject $H_0$ given that $H_0$ is not true.即1-typeII error.所以在固定typeI error的时候，power越大越好。 Eg. In the one-sample t-test, we assume that $X_1, . . . , X_n \sim N(\mu, \sigma^2)$. Consider $H_0 : \mu = \mu_0, H_1 : \mu > \mu_0.$ The rejection region is: $\bar x \ge \mu_0 + t_{\alpha,n-1}\frac{s}{\sqrt{n}}$What is the type-I error? When $\mu = \mu_0$, $\bar x \ge \mu_0 + t_{\alpha,n-1}\frac{s}{\sqrt{n}}$ 注意它的横坐标是$\bar x$ ![fd](typeI-error.png) 和上面的相比，We have known that the type-I error is $\alpha$ if the true distribution of $X$ is $N(\mu, \sigma^2)$. Eg2. Assume that $X_1, . . . , X_n \sim U(\mu − 1, \mu + 1)$, and we want to test $H_0 : \mu = 1, H_1 : \mu > 1.$ 所以我们想算：$$P(\frac{\bar X - \mu_0}{s/\sqrt{n}} > t_{\alpha, n-1}| X_1,...,X_n \sim U[0,2]\]

利用中心极限定理，当n足够大时，$\bar X - \mu_0}{s/\sqrt{n}$ 服从$N(0,1)$, 即$$P(N(0,1) > t_{\alpha, n-1} | X_1,...,X_n \sim U[0,2]$$,此处，通过生成M组X，来选择最P最接近$\alpha$的一组X。

n <- 20
alpha <- 0.05
mu0 <- 1
M <- 10000
t_value <- numeric(M)
set.seed(308)
for (j in 1:M){
  x <- runif(n, mu0 - 1, mu0 + 1)
  t_value[j] <- (mean(x) - mu0)/(sd(x) /sqrt(n))
}
is.reject <- (t_value > qt(alpha, df = n - 1, lower = FALSE))
mean(is.reject)

Eg.How about we replace U(0, 2) by the exponential distribution?

n <- 20
alpha <- 0.05
mu0 <- 1
M <- 10000
t_value <- numeric(M)
set.seed(308)
for (j in 1:M){
  x <- rexp(n, mu0)
  t_value[j] <- (mean(x) - mu0)/(sd(x) / sqrt(n))
}
is.reject <- (t_value > qt(alpha, df = n - 1, lower = FALSE))
mean(is.reject)

type-I-error 小了，但是type-II-error就大了，所以power小了，不好。

在分布未知的情况下，如何去找分位点做假设？
用MC生成一堆样本，用样本0.05的分位点，去做假设。

下面用非参的方法去做检验。（用于真实分布不是正态的）

n <- 20
alpha <- 0.05
mu0 <- 1
M <- 10000
p_value <- numeric(M)
set.seed(308)
for (j in 1:M){
x <- runif(n, mu0 - 1, mu0 + 1)
p_value[j] <- wilcox.test(x, mu = 1, alternative = "greater")$p.value
}
is.reject <- (p_value < alpha)
mean(is.reject)

此处wilcox的效果也不太好，因为不满足所需的对称性。

Power of a test

posted @ 2022-10-26 19:42 爱吃番茄的玛丽亚阅读(51) 评论(0) 编辑收藏举报

刷新页面返回顶部

pny01

热爱可抵岁月漫长~