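A forest contains trees, and here each tree is a decision tree. The input data is random and so are the variables: every tree in the forest is given a random input, and the results produced by the individual trees are combined by voting (majority rule) to classify an observation. It runs effectively on the data, and even with many variables it can run without removing any of them. It also fits populations with unbalanced classes well. -- See "Big Data Analysis Using R", Kim Kyung-tae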
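The idea of random inputs plus voting can be sketched in a few lines of R. This is only an illustration under stated assumptions, not how randomForest is implemented: it uses rpart for the individual trees, the forest size `n_trees` and the subset size of 2 variables are arbitrary choices, and the variable subset is drawn once per tree, whereas a real random forest redraws it at every split.

```r
library(rpart)  # recommended package that ships with standard R installs

set.seed(1)
data(iris)
n_trees  <- 25                                  # illustrative forest size
features <- setdiff(names(iris), "Species")

# Fit each tree on a bootstrap sample of rows and a random subset of variables.
# Simplification: the subset is drawn once per tree, not once per split.
trees <- lapply(seq_len(n_trees), function(i) {
  rows <- sample(nrow(iris), replace = TRUE)
  vars <- sample(features, 2)
  rpart(reformulate(vars, response = "Species"),
        data = iris[rows, ], method = "class")
})

# One column of predicted classes per tree (rows = observations).
votes <- sapply(trees, function(tr) as.character(predict(tr, iris, type = "class")))

# Majority vote across the trees for each observation.
forest_pred <- apply(votes, 1, function(v) names(which.max(table(v))))

mean(forest_pred == iris$Species)               # training accuracy of the vote
```

Ties in the vote are resolved here by taking the first class in table order. That is a shortcut for the sketch.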
The example uses data from the book "EXCEL Survey Methods and Statistical Analysis" (http://www.kyobobook.co.kr/product/detailViewKor.laf?ejkGb=KOR&mallGb=KOR&barcode=9788983257000&orderClick=LAG&Kc=SETLBkserp11_15).
cname <- c("ID", "蟲襷る", "磯","碁一", "碁", "覦覓碁", "蟇一朱")
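# stringsAsFactors = TRUE keeps the class column a factor, which classification needs in R >= 4.0
x <- read.table("c:\\data\\disc.txt", col.names = cname, stringsAsFactors = TRUE)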
head(x)
disc.txt
> head(x)
ID 蟲襷る 磯 碁一 碁 覦覓碁 蟇一朱
1 1 A 48 9000 4 5 6
2 2 A 58 8000 6 4 20
3 3 A 52 7000 6 4 12
4 4 A 63 7000 6 4 15
5 5 A 59 8000 4 6 6
6 6 A 38 11000 5 4 10
>
library(randomForest)
tree <- randomForest(蟲襷る ~ 磯 + 碁一 + 碁 + 覦覓碁 + 蟇一朱, data=x)
print(tree) # view results
importance(tree)
Result
> print(tree) # view results
Call:
randomForest(formula = 蟲襷る ~ 磯 + 碁一 + 碁 + 覦覓碁 + 蟇一朱, data = x)
Type of random forest: classification
Number of trees: 500
No. of variables tried at each split: 2
OOB estimate of error rate: 5%
Confusion matrix:
A B class.error
A 10 0 0.0
B 1 9 0.1
> importance(tree)
MeanDecreaseGini
磯 2.624620
碁一 1.815804
碁 1.263035
覦覓碁 1.196015
蟇一朱 2.576659
>
In decreasing order of importance, the variables that determine 蟲襷る are 磯 > 蟇一朱 > 碁一 > 碁 > 覦覓碁.
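The error figures in the printed confusion matrix can be reproduced by hand, which makes it easier to see what the OOB estimate means. The sketch below just re-enters the four counts from the output above; the object names `conf`, `class_error` and `oob_error` are illustrative.

```r
# Confusion matrix from the output: rows = actual class, columns = OOB prediction.
conf <- matrix(c(10, 0,
                  1, 9),
               nrow = 2, byrow = TRUE,
               dimnames = list(actual = c("A", "B"), predicted = c("A", "B")))

# Per-class error: share of each row that falls off the diagonal.
class_error <- 1 - diag(conf) / rowSums(conf)

# Overall OOB error: all off-diagonal cases divided by the total.
oob_error <- 1 - sum(diag(conf)) / sum(conf)

class_error   # A: 0, B: 0.1
oob_error     # 0.05, the 5% reported by print(tree)
```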
rf <- randomForest(factor(t3)~diff_cnt+diff_time, data=x6, type="classification", importance=TRUE,na.action=na.omit)
pred <- predict(rf, newdata=test)
table(pred, test$t3)
data(iris)
set.seed(111)
ind <- sample(2, nrow(iris), replace = TRUE, prob=c(0.8, 0.2))
iris.rf <- randomForest(Species ~ ., data=iris[ind == 1,])
iris.pred <- predict(iris.rf, iris[ind == 2,])
table(observed = iris[ind==2, "Species"], predicted = iris.pred)
As a tree plot..
install.packages("party")  # cforest() and prettytree() come from party, not rpart
library("party")
cf <- cforest(Species ~ ., data = iris)
pt <- party:::prettytree(cf@ensemble[[1]], names(cf@data@get("input")))
pt
nt <- new("BinaryTree")
nt@tree <- pt
nt@data <- cf@data
nt@responses <- cf@responses
nt
plot(nt)
install.packages("tree")
library(tree)
tr <- tree(Species ~ ., data=iris)
tr
3 Variable importance and Gini impurity
Gini impurity
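The Gini index is a measure of impurity. Suppose an object is drawn from the i-th category of the population and is then wrongly assigned to the j-th category (misclassification); the probability of this is P(i)P(j), where P(i) is the probability that an arbitrary object belongs to the i-th category. Summing these misclassification probabilities over all pairs of distinct categories gives the Gini index.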
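As a minimal sketch of this definition: with class proportions P(i), the sum of P(i)P(j) over all pairs i != j equals 1 minus the sum of P(i)^2, which is the form usually computed. The helper name `gini_impurity` is made up for illustration and is not part of randomForest.

```r
# Gini impurity of a vector of class labels (illustrative helper).
gini_impurity <- function(labels) {
  p <- table(labels) / length(labels)   # class proportions P(i)
  1 - sum(p^2)                          # same as summing P(i) * P(j) over i != j
}

gini_impurity(c("A", "A", "A", "A"))    # pure node: 0
gini_impurity(c("A", "A", "B", "B"))    # two classes, evenly mixed: 0.5

# The same value from the pairwise definition, for P = (0.5, 0.5).
p <- c(0.5, 0.5)
sum(outer(p, p)) - sum(p^2)             # 0.5
```

A node containing only one class has impurity 0, and the value grows as the classes become more evenly mixed.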
Variable importance
imp <- data.frame(importance(model))
imp[order(imp$MeanDecreaseGini, decreasing=TRUE),]
varImpPlot(model)
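When the model was fit with importance=TRUE, the result is plotted in two panels: importance in terms of accuracy (MeanDecreaseAccuracy) and importance in terms of Gini impurity (MeanDecreaseGini).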
4 RRF
Regularized Random Forest
install.packages("RRF")
library("RRF")
model <- RRF(factor(is_out) ~ ., data=training, type="classification", importance=TRUE)
pred <- predict(model, newdata=test2)
library(caret)  # confusionMatrix() comes from the caret package
confusionMatrix(pred, factor(test2$is_out, levels = levels(pred)))