用到[multcomp] (http://cran.r-project.org/web/packages/multcomp/index.html)包,并非方差分析完全教程
安装&加载
install.packages ( "multcomp" )
library ( multcomp )
数据
data ( warpbreaks )
str ( warpbreaks )
## 'data.frame': 54 obs. of 3 variables:
## $ breaks : num 26 30 54 25 70 52 51 26 67 18 ...
## $ wool : Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...
## $ tension: Factor w/ 3 levels "L","M","H": 1 1 1 1 1 1 1 1 1 2 ...
单因素方差分析
单因素(tension)三水平(L,M,H)方差分析
数据分布
#### Shapiro-Wilk 正态性检验
with ( warpbreaks , tapply ( breaks , tension , shapiro.test ))
## $L
##
## Shapiro-Wilk normality test
##
## data: X[[1L]]
## W = 0.9058, p-value = 0.07265
##
##
## $M
##
## Shapiro-Wilk normality test
##
## data: X[[2L]]
## W = 0.9428, p-value = 0.3232
##
##
## $H
##
## Shapiro-Wilk normality test
##
## data: X[[3L]]
## W = 0.9191, p-value = 0.1247
条形图
with ( warpbreaks , tapply ( breaks , tension , function ( x ) hist ( x , freq = F )))
对比双因素结果
library ( xtable )
amod <- aov ( breaks ~ tension , data = warpbreaks )
anov.tb <- xtable ( summary ( amod ), caption = "方差分析表" )
print ( anov.tb , type = "html" , caption.placement = getOption ( "xtable.caption.placement" ,
"top" ))
<TABLE border=1>
方差分析表
Df Sum Sq Mean Sq F value Pr(> F)
tension 2 2034.26 1017.13 7.21 0.0018
Residuals 51 7198.56 141.15
</TABLE>
多重比较
Tukey’s multiple comparisons tests: The “Honestly Significantly Different” (HSD) test proposed by the
statistician John Tukey is based on what is called the “studentized
range distribution.”
M与L差异显著,H与L差异极显著,H与M差异不显著
multcomp包
tuk <- glht ( amod , linfct = mcp ( tension = "Tukey" ))
summary ( tuk )
##
## Simultaneous Tests for General Linear Hypotheses
##
## Multiple Comparisons of Means: Tukey Contrasts
##
##
## Fit: aov(formula = breaks ~ tension, data = warpbreaks)
##
## Linear Hypotheses:
## Estimate Std. Error t value Pr(>|t|)
## M - L == 0 -10.00 3.96 -2.53 0.0384 *
## H - L == 0 -14.72 3.96 -3.72 0.0014 **
## H - M == 0 -4.72 3.96 -1.19 0.4631
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
基础包(stats)
( tuk.base <- TukeyHSD ( amod , "tension" ))
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = breaks ~ tension, data = warpbreaks)
##
## $tension
## diff lwr upr p adj
## M-L -10.000 -19.56 -0.4402 0.0385
## H-L -14.722 -24.28 -5.1624 0.0014
## H-M -4.722 -14.28 4.8376 0.4631
置信区间图
multcomp包
plot ( print ( confint ( tuk )))
基础包(stats)
plot ( tuk.base )
残差分布
残差 vs 拟合值可以检查残差是否有趋势或残差是否满足一致性;正态Q-Q图检查残差是否满足正态性。
plot ( amod , which = 1 : 2 )
双因素方差分析
双因素(tension和wool)多水平(t:L,M,H; w:A,B)方差分析
数据分布
多因素方差分析中,数据要求满足因素水平交叉内的正态性,然而水平交叉内的自由度往往较小,直接的正态性检查不可操作,因此一般由残差的正态性来反映,见残差分布 。
条形图
with ( warpbreaks , for ( i in c ( "L" , "H" , "M" )) {
hist ( warpbreaks [ wool == "A" & tension == i , "breaks" ], xlab = paste0 ( "breaks A" ,
i ), main = NULL )
})
with ( warpbreaks , for ( i in c ( "L" , "H" , "M" )) {
hist ( warpbreaks [ wool == "B" & tension == i , "breaks" ], xlab = paste0 ( "breaks B" ,
i ), main = NULL )
})
比较单因素 ,双因素中tension的p值小一点点,wool不显著。
library ( xtable )
amod2 <- aov ( breaks ~ tension + wool , data = warpbreaks )
anov.tb <- xtable ( summary ( amod2 ), caption = "方差分析表" )
print ( anov.tb , type = "html" , caption.placement = getOption ( "xtable.caption.placement" ,
"top" ))
<TABLE border=1>
方差分析表
Df Sum Sq Mean Sq F value Pr(> F)
tension 2 2034.26 1017.13 7.54 0.0014
wool 1 450.67 450.67 3.34 0.0736
Residuals 50 6747.89 134.96
</TABLE>
多重比较
tension同样是M与L差异显著,H与L差异极显著,H与M差异不显著
multcomp包
tuk2 <- glht ( amod2 , linfct = mcp ( tension = "Tukey" ))
summary ( tuk2 )
##
## Simultaneous Tests for General Linear Hypotheses
##
## Multiple Comparisons of Means: Tukey Contrasts
##
##
## Fit: aov(formula = breaks ~ tension + wool, data = warpbreaks)
##
## Linear Hypotheses:
## Estimate Std. Error t value Pr(>|t|)
## M - L == 0 -10.00 3.87 -2.58 0.0335 *
## H - L == 0 -14.72 3.87 -3.80 0.0011 **
## H - M == 0 -4.72 3.87 -1.22 0.4474
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
基础包(stats)
( tuk.base2 <- TukeyHSD ( amod2 , "tension" ))
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = breaks ~ tension + wool, data = warpbreaks)
##
## $tension
## diff lwr upr p adj
## M-L -10.000 -19.35 -0.6466 0.0336
## H-L -14.722 -24.08 -5.3688 0.0011
## H-M -4.722 -14.08 4.6312 0.4474
置信区间图
multcomp包
plot ( print ( confint ( tuk2 )))
基础包(stats)
plot ( tuk.base2 )
残差 vs 拟合值可以检查残差是否有趋势或残差是否满足一致性;正态Q-Q图检查残差是否满足正态性。
plot ( amod2 , which = 1 : 2 )