t-test in R
References:
http://en.wikipedia.org/wiki/Student%27s_t-test
http://mathworld.wolfram.com/Pairedt-Test.html
One-sample t-test
A one-sample location test of whether the mean of a population has a value specified in a null hypothesis.
The test statistic is
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
in which:
$\bar{x}$: the sample mean
$\mu_0$: the specified value of the population mean under the null hypothesis
n: the sample size (number of observations)
s: the sample standard deviation
The degrees of freedom df used in this test are n − 1. Although the parent population does not need to be normally distributed, the distribution of the sample mean, $\bar{x}$, is assumed to be normal. By the central limit theorem, if the parent population is sampled randomly, the sample means will be approximately normal for sufficiently large n.
Example 1:
> a <- rnorm(30, mean = 0, sd = 1)
> t.test(a)
Result:
One Sample t-test
data: a
t = -0.3769, df = 29, p-value = 0.709
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-0.5070374 0.3492322
sample estimates:
mean of x
-0.07890263
Example 2:
> b <- rnorm(30, mean = 10, sd = 1)
> t.test(b, mu = 10)
Result:
One Sample t-test
data: b
t = 0.6915, df = 29, p-value = 0.4947
alternative hypothesis: true mean is not equal to 10
95 percent confidence interval:
9.763909 10.477300
sample estimates:
mean of x
10.1206
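The one-sample statistic can also be computed by hand and compared with what t.test() reports. The following is a minimal sketch; the seed and the variable names (x, mu0, t_stat, p_val) are illustrative assumptions, not part of the examples above.

```r
# Manual one-sample t statistic, compared against t.test().
set.seed(1)                          # arbitrary seed, only for reproducibility
x   <- rnorm(30, mean = 10, sd = 1)
mu0 <- 10                            # mean under the null hypothesis

n      <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))
df     <- n - 1
p_val  <- 2 * pt(-abs(t_stat), df)   # two-sided p-value

res <- t.test(x, mu = mu0)

# Both comparisons are expected to be TRUE.
all.equal(t_stat, unname(res$statistic))
all.equal(p_val, res$p.value)
```

The manual values should agree with the t, df and p-value fields printed by t.test() up to rounding.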
Independent two-sample t-test
Equal sample sizes, equal variance
Two Sample t-test
This test is only used when both:
- the two sample sizes (that is, the number, n, of participants of each group) are equal;
- it can be assumed that the two distributions have the same variance.
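The t statistic is
$$t = \frac{\bar{x}_1 - \bar{x}_2 - \mu}{s_p \sqrt{2/n}}, \qquad s_p = \sqrt{\frac{s_1^2 + s_2^2}{2}}$$
where
$s_p$: the grand (pooled) standard deviation
$s_1^2$, $s_2^2$: the unbiased estimators of the variances of the two samples
$\mu$: the hypothesized difference between the two population means, set with the mu argument (0 by default)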
For significance testing, the degrees of freedom for this test are 2n − 2, where n is the number of participants in each group.
Other common uses of the t-test include:
- a test of the null hypothesis that the difference between two responses measured on the same statistical unit has a mean value of zero (the paired t-test);
- a test of whether the slope of a regression line differs significantly from 0.
Example:
> a <- rnorm(30, mean = 0, sd = 1)
> b <- rnorm(30, mean = 10, sd = 1)
> t.test(a, b, paired = FALSE, var.equal = TRUE)
# mu (default 0): the hypothesized difference between the means of the two samples
Result:
Two Sample t-test
data: a and b
t = -43.0835, df = 58, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-11.04438 -10.06367
sample estimates:
mean of x mean of y
-0.4334221 10.1206047
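To make the role of mu concrete, here is a hedged sketch that computes the equal-size, equal-variance statistic by hand, t = (mean1 − mean2 − mu) / (s_p * sqrt(2/n)), and compares it with t.test(). The seed, the variable names and the choice mu = -10 are illustrative assumptions.

```r
# Pooled two-sample t statistic for equal group sizes, with a non-zero mu.
set.seed(2)                          # arbitrary seed, only for reproducibility
x1 <- rnorm(30, mean = 0, sd = 1)
x2 <- rnorm(30, mean = 10, sd = 1)
mu <- -10                            # hypothesized value of mean(x1) - mean(x2)

n      <- length(x1)                 # both groups have the same size
s_p    <- sqrt((var(x1) + var(x2)) / 2)
t_stat <- (mean(x1) - mean(x2) - mu) / (s_p * sqrt(2 / n))
df     <- 2 * n - 2

res <- t.test(x1, x2, mu = mu, paired = FALSE, var.equal = TRUE)

# Both comparisons are expected to be TRUE.
all.equal(t_stat, unname(res$statistic))
all.equal(df, unname(res$parameter))
```

With mu = -10 the null hypothesis is close to the true difference of the simulated means, so the p-value will usually be large, unlike the mu = 0 example above.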
parameters of t.test()
Size of samples | paired | var.equal | Type of result
equal | default (FALSE) | default (FALSE) | Welch Two Sample t-test
equal | FALSE | TRUE | Two Sample t-test
equal | TRUE | TRUE | Paired t-test
equal | FALSE | FALSE | Welch Two Sample t-test
equal | TRUE | FALSE | Paired t-test
CONCLUSION:
paired = TRUE: Paired t-test
paired = FALSE, var.equal = FALSE: Welch Two Sample t-test
paired = FALSE, var.equal = TRUE: Two Sample t-test
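The same conclusion can be checked programmatically by reading the method element of the object returned by t.test(). This is a small sketch with made-up data; the names x, y and grid are illustrative, and trimws() is used because the method string may carry leading whitespace.

```r
# Which test does t.test() run for each combination of paired and var.equal?
set.seed(3)                          # arbitrary seed, only for reproducibility
x <- rnorm(30)
y <- rnorm(30, mean = 1)

# All four combinations of the two logical arguments.
grid <- expand.grid(paired = c(FALSE, TRUE), var.equal = c(FALSE, TRUE))

# Run the test for each row and keep only the name of the method.
grid$method <- mapply(
  function(p, v) trimws(t.test(x, y, paired = p, var.equal = v)$method),
  grid$paired, grid$var.equal
)

print(grid)
```

Rows with paired = TRUE should report the paired test regardless of var.equal.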
Unequal sample sizes, equal variance
Two Sample t-test
This test is used only when it can be assumed that the two distributions have the same variance.
The total number of degrees of freedom used in significance testing is the total sample size minus two, that is, n1 + n2 − 2.
The equal-sample-size, equal-variance case discussed above is a special case of this test.
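For reference, the statistic usually given for this case is the pooled-variance form below. It is not written out in the text, so the notation ($\bar{x}_i$, $s_i^2$, $n_i$ for the mean, unbiased variance and size of sample i, and $s_p$ for the pooled standard deviation) is an assumption.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p = \sqrt{\frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}}
```

Setting $n_1 = n_2 = n$ gives $s_p = \sqrt{(s_1^2 + s_2^2)/2}$ and a factor of $\sqrt{2/n}$ in the denominator, with 2n − 2 degrees of freedom.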
Example:
> a <- rnorm(30, mean = 0, sd = 1)
> c <- rnorm(50, mean = 0, sd = 1)
> t.test(a, c, paired = FALSE, var.equal = TRUE)
Result:
Two Sample t-test
data: a and c
t = -1.4447, df = 78, p-value = 0.1525
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.7661759 0.1217896
sample estimates:
mean of x mean of y
-0.4334221 -0.1112289
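The pooled test with unequal group sizes can be reproduced by hand, including the confidence interval. This is a sketch under the usual pooled-variance definition; the seed and all variable names are illustrative assumptions.

```r
# Pooled-variance t-test with unequal group sizes, computed manually.
set.seed(4)                          # arbitrary seed, only for reproducibility
x1 <- rnorm(30, mean = 0, sd = 1)
x2 <- rnorm(50, mean = 0, sd = 1)

n1 <- length(x1)
n2 <- length(x2)
df <- n1 + n2 - 2

# Pooled standard deviation and standard error of the mean difference.
s_p <- sqrt(((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / df)
se  <- s_p * sqrt(1 / n1 + 1 / n2)

t_stat <- (mean(x1) - mean(x2)) / se
ci     <- (mean(x1) - mean(x2)) + c(-1, 1) * qt(0.975, df) * se

res <- t.test(x1, x2, paired = FALSE, var.equal = TRUE)

# Both comparisons are expected to be TRUE.
all.equal(t_stat, unname(res$statistic))
all.equal(ci, as.numeric(res$conf.int))
```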
Equal or Unequal sample sizes, unequal variances
This test, also known as Welch's t-test, is used when the two population variances are not assumed to be equal (the two sample sizes may or may not be equal) and hence must be estimated separately. The t statistic to test whether the population means are different is calculated as:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
where $s_i^2$ is the unbiased estimator of the variance of sample i and $n_i$ is its size.
For use in significance testing, the distribution of the test statistic is approximated as an ordinary Student's t distribution with the degrees of freedom calculated using
$$\mathrm{df} = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
This is known as the Welch–Satterthwaite equation. The true distribution of the test statistic actually depends (slightly) on the two unknown population variances (see Behrens–Fisher problem).
Example 1:
> a <- rnorm(30, mean = 0, sd = 1)
> d <- rnorm(30, mean = 0, sd = 0.01)
> t.test(a, d)
Result:
Welch Two Sample t-test
data: a and d
t = -2.5091, df = 29.005, p-value = 0.01794
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.78349422 -0.07981147
sample estimates:
mean of x mean of y
-0.433422065 -0.001769222
Example 2:
> t.test(c,d)
Result:
Welch Two Sample t-test
data: c and d
t = 0.7864, df = 49.01, p-value = 0.4354
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.2053747 0.4694490
sample estimates:
mean of x mean of y
0.1314145253 -0.0006225961
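The fractional degrees of freedom in these results come from the Welch–Satterthwaite approximation. Here is a minimal sketch that computes Welch's statistic and its degrees of freedom by hand; the seed and variable names are illustrative assumptions.

```r
# Welch's t statistic and Welch-Satterthwaite degrees of freedom, by hand.
set.seed(5)                          # arbitrary seed, only for reproducibility
x1 <- rnorm(30, mean = 0, sd = 1)
x2 <- rnorm(50, mean = 0, sd = 0.01)

# Squared standard error contributed by each sample.
v1 <- var(x1) / length(x1)
v2 <- var(x2) / length(x2)

t_stat <- (mean(x1) - mean(x2)) / sqrt(v1 + v2)
df     <- (v1 + v2)^2 /
  (v1^2 / (length(x1) - 1) + v2^2 / (length(x2) - 1))

res <- t.test(x1, x2)                # var.equal = FALSE is the default

# Both comparisons are expected to be TRUE.
all.equal(t_stat, unname(res$statistic))
all.equal(df, unname(res$parameter))
```

When one sample's variance term dominates, as here, the degrees of freedom stay close to that sample's n − 1, which is consistent with the values just above 29 and 49 in the two examples.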
Dependent t-test for paired samples
This test is used when the samples are dependent; that is, when there is only one sample that has been tested twice (repeated measures) or when there are two samples that have been matched or "paired". This is an example of a paired difference test.
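Paired t-test:
$$t = \frac{\bar{x}_D - \mu_0}{s_D / \sqrt{n}}$$
where $\bar{x}_D$ and $s_D$ are the mean and standard deviation of the pairwise differences, $\mu_0$ is the hypothesized mean difference (0 by default), and n is the number of pairs.
The degrees of freedom used are n − 1.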
> t.test(a,b,paired = TRUE, var.equal = TRUE)
Result:
Paired t-test
data: a and b
t = -38.1485, df = 29, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-11.11985 -9.98820
sample estimates:
mean of the differences
-10.55403
> t.test(a,b,paired = TRUE, var.equal = FALSE)
Result:
Paired t-test
data: a and b
t = -38.1485, df = 29, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-11.11985 -9.98820
sample estimates:
mean of the differences
-10.55403
# var.equal is ignored when paired = TRUE, so both calls give the same result.
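A paired t-test is equivalent to a one-sample t-test on the pairwise differences, which is why var.equal has no effect. The sketch below illustrates this with hypothetical before/after measurements; the data, seed and variable names are assumptions made for the example.

```r
# A paired t-test equals a one-sample t-test on the differences.
set.seed(6)                          # arbitrary seed, only for reproducibility
before <- rnorm(30, mean = 0, sd = 1)
after  <- before + rnorm(30, mean = 0.5, sd = 0.5)   # correlated with 'before'

d <- before - after                  # pairwise differences

res_paired <- t.test(before, after, paired = TRUE)
res_diff   <- t.test(d)              # one-sample test against mu = 0
res_vareq  <- t.test(before, after, paired = TRUE, var.equal = TRUE)

# Manual statistic from the differences.
t_manual <- mean(d) / (sd(d) / sqrt(length(d)))

# All three comparisons are expected to be TRUE.
all.equal(unname(res_paired$statistic), unname(res_diff$statistic))
all.equal(unname(res_paired$statistic), t_manual)
all.equal(res_paired$p.value, res_vareq$p.value)
```

Unlike the independent samples a and b used above, these hypothetical measurements are genuinely paired, which is the situation the paired test is designed for.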