#title Tests for Proportions
[[TableOfContents]]

==== z statistic ====
 * z = 1.645 for a 90% confidence interval
 * z = 1.960 for a 95% confidence interval
 * z = 2.576 for a 99% confidence interval
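These critical values can be reproduced with `qnorm`. A minimal sketch; `conf_levels` and `z_crit` are illustrative names that do not appear in the original page:

{{{
# Two-sided critical value: the (1 + conf) / 2 quantile of the standard normal
conf_levels <- c(0.90, 0.95, 0.99)
z_crit <- qnorm((1 + conf_levels) / 2)
round(z_crit, 3)
}}}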

==== Quick margin-of-error calculation ====
On issue A, 300 out of 1000 randomly selected citizens were in favor. What is the margin of error?
 * 1/sqrt(1000) --> ±3.2%
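The 1/sqrt(n) shortcut is a conservative approximation at roughly 95% confidence: it is what z * sqrt(p * (1-p) / n) gives with p = 0.5 and z rounded up to 2. The sketch below compares it with the plain Wald margin for the observed proportion; all variable names are illustrative.

{{{
n <- 1000
p_hat <- 300 / n
quick_moe <- 1 / sqrt(n)                                    # rule of thumb
wald_moe  <- qnorm(0.975) * sqrt(p_hat * (1 - p_hat) / n)   # uses the observed p
round(c(quick = quick_moe, wald = wald_moe), 4)
}}}

Because p * (1-p) is largest at p = 0.5, the shortcut is never narrower than the Wald margin at the same confidence level.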

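==== Confidence interval with the modified Wald method ====
 * Margin of error: z * sqrt(p' * (1-p') / (n+z^2)), centered on the adjusted proportion p' = (x + z^2/2) / (n+z^2)
 * In the example above, approximating p' with the observed 0.3: 1.96 * sqrt(0.3 * (1-0.3) / (1000+1.96^2)) ≈ ±2.8%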
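A sketch of the modified Wald interval for the same survey, assuming the method meant here is the Agresti-Coull adjustment (add z^2/2 successes and z^2 trials before applying the Wald formula). The names `n_adj`, `p_adj`, and `moe` are illustrative.

{{{
# Modified Wald (Agresti-Coull) interval for x successes out of n trials
x <- 300; n <- 1000
z <- qnorm(0.975)
n_adj <- n + z^2                  # adjusted sample size
p_adj <- (x + z^2 / 2) / n_adj    # adjusted proportion (center of the interval)
moe   <- z * sqrt(p_adj * (1 - p_adj) / n_adj)
c(lower = p_adj - moe, upper = p_adj + moe)
}}}

With n this large the adjusted proportion is very close to the raw 0.3, so the margin barely changes.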

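==== 1-sample proportion test ====
Classroom A has 100 students, 86 of whom are right-handed. In Korea, 94% of people are right-handed. Is the proportion of right-handed students in classroom A the same as in Korea?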
{{{
> prop.test(86,100,p=0.94)

	1-sample proportions test with continuity correction

data:  86 out of 100, null probability 0.94
X-squared = 9.9734, df = 1, p-value = 0.001588
alternative hypothesis: true p is not equal to 0.94
95 percent confidence interval:
 0.7728837 0.9185961
sample estimates:
   p 
0.86 
}}}
 * At the 0.05 significance level the null hypothesis is rejected in favor of the alternative (p-value = 0.001588 < 0.05).
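The X-squared value reported by `prop.test` can be reproduced by hand. This is a sketch of the continuity-corrected one-sample statistic, with illustrative variable names:

{{{
x <- 86; n <- 100; p0 <- 0.94
# Yates continuity correction: subtract 0.5 from |observed - expected|
chisq <- (abs(x - n * p0) - 0.5)^2 / (n * p0 * (1 - p0))
p_value <- pchisq(chisq, df = 1, lower.tail = FALSE)
c(X.squared = chisq, p.value = p_value)
}}}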


Reference: http://www.r-bloggers.com/one-proportion-z-test-in-r/
{{{
z.test <- function(x,n,p=NULL,conf.level=0.95,alternative="less") {
  ts.z <- NULL
  cint <- NULL
  p.val <- NULL
  phat <- x/n
  qhat <- 1 - phat
  # If you have p0 from the population or H0, use it.
  # Otherwise, use phat and qhat to find SE.phat:
  if(length(p) > 0) { 
    q <- 1-p
    SE.phat <- sqrt((p*q)/n) 
    ts.z <- (phat - p)/SE.phat
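    # Lower-tail p-value, used when alternative == "less" (the default)
    p.val <- pnorm(ts.z)
    if(alternative=="two.sided") {
      # Double the smaller tail so the p-value never exceeds 1
      p.val <- 2 * pnorm(-abs(ts.z))
    }
    if(alternative=="greater") {
      # Upper-tail p-value
      p.val <- pnorm(ts.z, lower.tail=FALSE)
    }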
  } else {
    # If all you have is your sample, use phat to find
    # SE.phat, and don't run the hypothesis test:
    SE.phat <- sqrt((phat*qhat)/n)
  }
  cint <- phat + c( 
    -1*((qnorm(((1 - conf.level)/2) + conf.level))*SE.phat),
    ((qnorm(((1 - conf.level)/2) + conf.level))*SE.phat) )
  return(list(estimate=phat,ts.z=ts.z,p.val=p.val,cint=cint))
}

z.test(86,100,p=0.94)
}}}

{{{
> z.test(86,100,p=0.94)
$estimate
[1] 0.86

$ts.z
[1] -3.368608

$p.val
[1] 0.0003777444

$cint
[1] 0.8134534 0.9065466
}}}
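The p-value above is smaller than the one from `prop.test` because `z.test` defaults to `alternative="less"` (one-sided) and applies no continuity correction. The sketch below computes the two-sided z-test directly and compares it with `prop.test(..., correct = FALSE)`; variable names are illustrative.

{{{
x <- 86; n <- 100; p0 <- 0.94
z <- (x / n - p0) / sqrt(p0 * (1 - p0) / n)
p_two_sided <- 2 * pnorm(-abs(z))
# Without continuity correction, prop.test's X-squared equals z^2
res <- prop.test(x, n, p = p0, correct = FALSE)
c(z = z, z_squared = z^2, X.squared = unname(res$statistic),
  p_z = p_two_sided, p_prop = res$p.value)
}}}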
==== Test for proportions across n groups ====
In city A, 100 out of 300 respondents support candidate D; in city B, 170 out of 400 do. Can the support rates for candidate D in cities A and B be considered equal?

{{{
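분자 <- c(100, 170)  # numerators: supporters of candidate D in cities A and B
분모 <- c(300, 400)  # denominators: respondents surveyed in cities A and B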
prop.test(분자, 분모)
}}}

{{{
> prop.test(분자, 분모)

	2-sample test for equality of proportions with continuity correction

data:  분자 out of 분모
X-squared = 5.6988, df = 1, p-value = 0.01698
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.16664176 -0.01669158
sample estimates:
   prop 1    prop 2 
0.3333333 0.4250000 
}}}

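Interpretation
 * This tests whether the proportion of some event can be considered equal across two groups.
 * Hypotheses
  * Null hypothesis: there is no difference.
  * Alternative hypothesis: there is a difference. --> At the 0.05 significance level the null is rejected (p-value = 0.01698 < 0.05); at the 0.01 level it is not rejected.
 * 95% confidence interval: the reported interval is for the difference prop 1 - prop 2, -0.16664176 ~ -0.01669158, and it excludes 0. Relative to prop 2: 0.4250000-0.16664176 ~ 0.4250000-0.01669158 = 0.2583582 ~ 0.4083084, with 100/300 as the reference.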
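The interval for the difference in proportions can be rebuilt by hand. This is a sketch assuming the Wald half-width plus the continuity correction 0.5 * (1/n1 + 1/n2); it uses ASCII variable names (`x`, `n`, `delta`) that are not in the original page.

{{{
x <- c(100, 170); n <- c(300, 400)
p <- x / n
delta <- p[1] - p[2]
# Wald half-width plus the continuity correction 0.5 * (1/n1 + 1/n2)
half_width <- qnorm(0.975) * sqrt(sum(p * (1 - p) / n)) + 0.5 * sum(1 / n)
ci_manual <- delta + c(-1, 1) * half_width
ci_manual
as.numeric(prop.test(x, n)$conf.int)   # should agree with ci_manual
}}}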

The same calculation in Excel:
attachment:비율에대한검정/prop.test.excel.png
==== Incidence rate (Exact Poisson tests) ====
For count data:
{{{
> poisson.test(분자, 분모)

	Comparison of Poisson rates

data:  분자 time base: 분모
count1 = 100, expected count1 = 115.71, p-value = 0.05656
alternative hypothesis: true rate ratio is not equal to 1
95 percent confidence interval:
 0.6064139 1.0099403
sample estimates:
rate ratio 
 0.7843137 
}}}
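The two-sample comparison can be read as a binomial test: if the two rates are equal, the first count given the total count is binomial with success probability T1 / (T1 + T2). The sketch below checks that reading against `poisson.test`, using illustrative ASCII names `x` and `tb` for the counts and time bases.

{{{
x <- c(100, 170); tb <- c(300, 400)
# Under equal rates, count 1 given the total is binomial with p = T1 / (T1 + T2)
bt <- binom.test(x[1], sum(x), p = tb[1] / sum(tb))
c(expected_count1 = sum(x) * tb[1] / sum(tb),
  p_binom = bt$p.value,
  p_poisson = poisson.test(x, tb)$p.value)
}}}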

One sample
{{{
> poisson.test(83, 100)

	Exact Poisson test

data:  83 time base: 100
number of events = 83, time base = 100, p-value = 0.09854
alternative hypothesis: true event rate is not equal to 1
95 percent confidence interval:
 0.6610904 1.0289099
sample estimates:
event rate 
      0.83 
}}}
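 * Null hypothesis: the population event rate (λ) equals the rate assumed under the null (1 in this call).
 * Alternative hypothesis: the population event rate (λ) differs from the rate assumed under the null.
 * At the 0.05 significance level the null hypothesis is not rejected (p-value = 0.09854).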
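The exact confidence interval for the one-sample rate can be sketched with gamma quantiles, assuming the usual exact Poisson limits (lower from a gamma with shape x, upper from a gamma with shape x + 1). The names `time_base` and `ci_manual` are illustrative.

{{{
x <- 83; time_base <- 100
alpha <- 0.05
# Exact limits for the Poisson mean, rescaled by the time base
ci_manual <- c(qgamma(alpha / 2, x), qgamma(1 - alpha / 2, x + 1)) / time_base
ci_manual
as.numeric(poisson.test(x, time_base)$conf.int)   # should agree with ci_manual
}}}

The interval contains 1, which is consistent with not rejecting the null at the 0.05 level.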