t-test in R

References:
http://en.wikipedia.org/wiki/Student%27s_t-test

http://mathworld.wolfram.com/Pairedt-Test.html

 

One-sample t-test

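The one-sample t-test is a location test of whether the mean of a population equals a value specified in the null hypothesis. The test statistic is

$$ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$

in which:

$\bar{x}$: the sample mean

$\mu_0$: the population mean specified in the null hypothesis (the mu argument of t.test(), 0 by default)

n: the number of observations in the sample

s: the sample standard deviation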
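The quantities listed above are enough to compute the statistic by hand. The following is a minimal sketch, assuming the usual one-sample form (sample mean minus hypothesized mean, divided by s/sqrt(n)); the seed and the names x, mu0, t_stat and p_val are illustrative and do not come from the examples in this post.

```r
set.seed(1)                        # illustrative seed so the sketch is reproducible
x   <- rnorm(30, mean = 0, sd = 1)
mu0 <- 0                           # hypothesized mean (the mu argument of t.test)

n      <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))   # one-sample t statistic
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)      # two-sided p-value, df = n - 1

res <- t.test(x, mu = mu0)
# Compare the manual values with what t.test() reports
print(c(manual = t_stat, t.test = unname(res$statistic)))
print(c(manual = p_val,  t.test = res$p.value))
```

The manual and built-in values should agree up to floating-point rounding.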

The degrees of freedom df used in this test are n − 1. Although the parent population does not need to be normally distributed, the distribution of the sample mean, $\bar{x}$, is assumed to be normal. By the central limit theorem, if the observations are sampled independently from a population with finite variance, the sample mean is approximately normal when n is large enough.
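The central limit theorem argument can be illustrated with a small simulation. This is only a sketch: the exponential parent population, the number of replicates and the variable names are assumptions chosen for illustration, not part of the original post.

```r
set.seed(42)                       # illustrative seed
n     <- 30                        # size of each sample
n_rep <- 5000                      # number of simulated samples

# Draw many samples from a clearly non-normal (exponential) population
# and keep the mean of each one.
sample_means <- replicate(n_rep, mean(rexp(n, rate = 1)))

# The population has mean 1 and sd 1, so the sample means should have
# mean close to 1 and sd close to 1 / sqrt(n).
print(c(mean = mean(sample_means), sd = sd(sample_means), expected_sd = 1 / sqrt(n)))

# hist(sample_means, breaks = 50)  # uncomment to see the roughly bell-shaped histogram
```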

Example 1:

> a=rnorm(30, mean = 0, sd = 1)

> t.test(a)
Result:

    One Sample t-test

data: a

t = -0.3769, df = 29, p-value = 0.709

alternative hypothesis: true mean is not equal to 0

95 percent confidence interval:

-0.5070374 0.3492322

sample estimates:

mean of x

-0.07890263

Example 2:

> b=rnorm(30, mean = 10, sd = 1)

> t.test(b, mu = 10)

Result:

    One Sample t-test

data: b

t = 0.6915, df = 29, p-value = 0.4947

alternative hypothesis: true mean is not equal to 10

95 percent confidence interval:

9.763909 10.477300

sample estimates:

mean of x

10.1206

 

Independent two-sample t-test

  Equal sample sizes, equal variance

Two Sample t-test

This test is only used when both:

  • the two sample sizes (that is, the number, n, of participants of each group) are equal;
  • it can be assumed that the two distributions have the same variance.

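    The t statistic is

    $$ t = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_0}{s_p \sqrt{2/n}}, \qquad s_p = \sqrt{\frac{s_{X_1}^2 + s_{X_2}^2}{2}} $$

    where

    $s_p$: the grand (pooled) standard deviation

    $s_{X_1}^2$, $s_{X_2}^2$: the unbiased estimators of the variances of the two samples

    $\mu_0$: the hypothesized difference between the two population means, set by mu (0 by default)

    For significance testing, the degrees of freedom for this test are 2n − 2, where n is the number of participants in each group.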
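The equal-size, equal-variance statistic can be reproduced by hand. The sketch below assumes the standard pooled form (pooled standard deviation from the average of the two sample variances, standard error s_p * sqrt(2/n)) and a hypothesized difference of 0; the seed and variable names are illustrative.

```r
set.seed(2)                        # illustrative seed
n <- 30
x <- rnorm(n, mean = 0,  sd = 1)
y <- rnorm(n, mean = 10, sd = 1)

s_p    <- sqrt((var(x) + var(y)) / 2)               # pooled SD when both groups have size n
t_stat <- (mean(x) - mean(y)) / (s_p * sqrt(2 / n)) # hypothesized difference is 0
df     <- 2 * n - 2

res <- t.test(x, y, paired = FALSE, var.equal = TRUE)
print(c(manual_t = t_stat, t.test_t = unname(res$statistic)))
print(c(manual_df = df, t.test_df = unname(res$parameter)))
```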

Two other common uses of the t-test, which the formula above does not cover, are:

A test of the null hypothesis that the difference between two responses measured on the same statistical unit has a mean value of zero (the paired t-test).

A test of whether the slope of a regression line differs significantly from 0.

 

Example:

> a=rnorm(30, mean = 0, sd = 1)

> b=rnorm(30, mean = 10, sd = 1)

> t.test(a,b,paired = FALSE, var.equal = TRUE)

 

# mu: the hypothesized difference between the means of the two populations (default 0)

Result:

    Two Sample t-test

data: a and b

t = -43.0835, df = 58, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

-11.04438 -10.06367

sample estimates:

mean of x mean of y

-0.4334221 10.1206047

parameters of t.test()

Size of samples | paired          | var.equal       | Type of result
----------------|-----------------|-----------------|------------------------
equal           | default (FALSE) | default (FALSE) | Welch Two Sample t-test
equal           | FALSE           | TRUE            | Two Sample t-test
equal           | TRUE            | TRUE            | Paired t-test
equal           | FALSE           | FALSE           | Welch Two Sample t-test
equal           | TRUE            | FALSE           | Paired t-test

CONCLUSION:

paired = TRUE: Paired t-test

paired = FALSE, var.equal = FALSE: Welch Two Sample t-test

paired = FALSE, var.equal = TRUE: Two Sample t-test
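The mapping from arguments to test type can be checked directly, because the object returned by t.test() stores the title of the test in its method element. A minimal sketch with illustrative data (x and y are not the a and b used above); trimws() is applied in case the stored title carries leading whitespace.

```r
set.seed(3)                        # illustrative data
x <- rnorm(30, mean = 0, sd = 1)
y <- rnorm(30, mean = 1, sd = 1)

# $method holds the title that print() shows at the top of the result
m_default <- trimws(t.test(x, y)$method)
m_pooled  <- trimws(t.test(x, y, paired = FALSE, var.equal = TRUE)$method)
m_paired  <- trimws(t.test(x, y, paired = TRUE,  var.equal = FALSE)$method)

print(c(default = m_default, pooled = m_pooled, paired = m_paired))
```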

Unequal sample sizes, equal variance

Two Sample t-test

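This test is used only when it can be assumed that the two distributions have the same variance. The t statistic is

$$ t = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad s_p = \sqrt{\frac{(n_1 - 1) s_{X_1}^2 + (n_2 - 1) s_{X_2}^2}{n_1 + n_2 - 2}} $$

where $n_1$ and $n_2$ are the two sample sizes, $s_{X_1}^2$ and $s_{X_2}^2$ are the unbiased estimators of the two sample variances, and $\mu_0$ is the hypothesized difference between the means (mu, 0 by default).

The total sample size minus two (that is, n1 + n2 − 2) is the number of degrees of freedom used in significance testing.

The equal sample sizes, equal variance case discussed above is the special case n1 = n2.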
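For unequal group sizes the pooled variance weights each sample variance by its own degrees of freedom. The sketch below assumes that standard pooled form and a hypothesized difference of 0; the seed and variable names are illustrative.

```r
set.seed(4)                        # illustrative seed
x  <- rnorm(30, mean = 0, sd = 1)
y  <- rnorm(50, mean = 0, sd = 1)
n1 <- length(x)
n2 <- length(y)

# Pooled SD: sample variances weighted by their degrees of freedom
s_p    <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
t_stat <- (mean(x) - mean(y)) / (s_p * sqrt(1 / n1 + 1 / n2))
df     <- n1 + n2 - 2

res <- t.test(x, y, paired = FALSE, var.equal = TRUE)
print(c(manual_t = t_stat, t.test_t = unname(res$statistic)))
print(c(manual_df = df, t.test_df = unname(res$parameter)))
```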

Example:

> a=rnorm(30, mean = 0, sd = 1)

> c=rnorm(50, mean = 0, sd = 1)

> t.test(a,c,paired = FALSE, var.equal = TRUE)

 

Result:

    Two Sample t-test

data: a and c

t = -1.4447, df = 78, p-value = 0.1525

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

-0.7661759 0.1217896

sample estimates:

mean of x mean of y

-0.4334221 -0.1112289

Equal or Unequal sample sizes, unequal variances

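This test, also known as Welch's t-test, is used only when the two population variances are not assumed to be equal (the two sample sizes may or may not be equal) and hence must be estimated separately. The t statistic to test whether the population means are different is calculated as:

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{\Delta}}}, \qquad s_{\bar{\Delta}} = \sqrt{\frac{s_{X_1}^2}{n_1} + \frac{s_{X_2}^2}{n_2}} $$

where $s_{X_i}^2$ is the unbiased variance estimator and $n_i$ the size of sample i.

For use in significance testing, the distribution of the test statistic is approximated as an ordinary Student's t distribution with the degrees of freedom calculated using

$$ \mathrm{df} = \frac{\left( s_{X_1}^2 / n_1 + s_{X_2}^2 / n_2 \right)^2}{\frac{\left( s_{X_1}^2 / n_1 \right)^2}{n_1 - 1} + \frac{\left( s_{X_2}^2 / n_2 \right)^2}{n_2 - 1}} $$

This is known as the Welch–Satterthwaite equation. The true distribution of the test statistic actually depends (slightly) on the two unknown population variances (see Behrens–Fisher problem).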
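The Welch statistic and the Welch–Satterthwaite degrees of freedom can also be computed by hand. A minimal sketch, assuming the standard formulas (separate variance estimates divided by their sample sizes); the seed, sample sizes and variable names are illustrative.

```r
set.seed(5)                        # illustrative seed
x <- rnorm(30, mean = 0, sd = 1)
y <- rnorm(50, mean = 0, sd = 0.01)

vx <- var(x) / length(x)           # squared standard error of mean(x)
vy <- var(y) / length(y)           # squared standard error of mean(y)

t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
# Welch-Satterthwaite approximation to the degrees of freedom
df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

res <- t.test(x, y)                # var.equal = FALSE is the default
print(c(manual_t = t_stat, t.test_t = unname(res$statistic)))
print(c(manual_df = df, t.test_df = unname(res$parameter)))
```

Because the variance of y is tiny here, the degrees of freedom should come out close to length(x) − 1, which matches the df of about 29 and 49 seen in the examples below.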

Example 1:

> a=rnorm(30, mean = 0, sd = 1)

> d=rnorm(30, mean = 0, sd = 0.01)

> t.test(a,d)

 

Result:

    Welch Two Sample t-test

data: a and d

t = -2.5091, df = 29.005, p-value = 0.01794

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

-0.78349422 -0.07981147

sample estimates:

mean of x mean of y

-0.433422065 -0.001769222

Example 2:

> t.test(c,d)

Result:

    Welch Two Sample t-test

data: c and d

t = 0.7864, df = 49.01, p-value = 0.4354

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

-0.2053747 0.4694490

sample estimates:

mean of x mean of y

0.1314145253 -0.0006225961

Dependent t-test for paired samples

This test is used when the samples are dependent; that is, when there is only one sample that has been tested twice (repeated measures) or when there are two samples that have been matched or "paired". This is an example of a paired difference test.

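Paired t-test:

$$ t = \frac{\bar{X}_D - \mu_0}{s_D / \sqrt{n}} $$

where $\bar{X}_D$ and $s_D$ are the mean and standard deviation of the pairwise differences, n is the number of pairs, and $\mu_0$ is the hypothesized mean difference (mu, 0 by default).

The degrees of freedom used are n − 1.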
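A paired t-test is equivalent to a one-sample t-test on the pairwise differences, which is why the degrees of freedom are n − 1. The sketch below checks this; the before/after data are made up for illustration and are not the a and b used in this post.

```r
set.seed(6)                        # illustrative seed
before <- rnorm(30, mean = 0, sd = 1)
after  <- before + rnorm(30, mean = 10, sd = 0.5)   # second measurement on the same units

d <- before - after                # pairwise differences

paired_res <- t.test(before, after, paired = TRUE)
diff_res   <- t.test(d)            # one-sample test of mean(d) == 0

print(c(paired_t = unname(paired_res$statistic), diff_t = unname(diff_res$statistic)))
print(c(paired_df = unname(paired_res$parameter), diff_df = unname(diff_res$parameter)))
```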

> t.test(a,b,paired = TRUE, var.equal = TRUE)

Result:

    Paired t-test

data: a and b

t = -38.1485, df = 29, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

-11.11985 -9.98820

sample estimates:

mean of the differences

-10.55403

> t.test(a,b,paired = TRUE, var.equal = FALSE)

Result:

    Paired t-test

data: a and b

t = -38.1485, df = 29, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

-11.11985 -9.98820

sample estimates:

mean of the differences

-10.55403

# var.equal is ignored when paired = TRUE, so the two calls give identical results.

posted @ 2014-06-12 11:52  此间漫步