#title DataExplorer Package
[[TableOfContents]]

--Source: https://www.r-bloggers.com/how-to-automate-eda-with-dataexplorer-in-r/

==== Installation ====
{{{
install.packages("DataExplorer")
library(DataExplorer)
}}}

==== Creating Test Data ====
{{{
mydata <- iris
# Inject missing values into column 3 (Petal.Length), rows 5 to 50
mydata[5:50, 3] <- NA
mydata$Species <- as.factor(mydata$Species)
head(mydata)
}}}

==== Continuous Variables, Discrete Variables, Missing Values ====
{{{
plot_intro(mydata, title = "Automated EDA with Data Explorer")
}}}
attachment:DataExplorerPackage/1.png

==== Missing Values ====
{{{
plot_missing(mydata)
}}}
attachment:DataExplorerPackage/2.png

==== Histogram ====
{{{
plot_histogram(mydata)
}}}
attachment:DataExplorerPackage/3.png

==== Density ====
{{{
plot_density(mydata)
}}}
attachment:DataExplorerPackage/4.png

==== BoxPlot ====
{{{
plot_boxplot(mydata, by = 'Species', ncol = 2)
}}}
attachment:DataExplorerPackage/5.png

==== Correlation ====
{{{
# 'complete.obs' drops rows with missing values before computing correlations
plot_correlation(mydata, cor_args = list('use' = 'complete.obs'))
}}}
attachment:DataExplorerPackage/6.png

{{{
# type = 'c' restricts the plot to continuous variables
plot_correlation(mydata, type = 'c', cor_args = list('use' = 'complete.obs'))
}}}
attachment:DataExplorerPackage/7.png

==== Categorical ====
{{{
plot_bar(mydata$Species, maxcat = 20)
#plot_bar(mydata, maxcat = 20, parallel = TRUE)  # not supported on Windows
}}}
attachment:DataExplorerPackage/8.png

{{{
plot_bar(mydata, with = c("Sepal.Length"), maxcat = 20)
#plot_bar(mydata, with = c("Sepal.Length"), maxcat = 20, parallel = TRUE)
}}}
attachment:DataExplorerPackage/9.png

==== Automating EDA ====
The following call generates a {{{Data Profiling Report}}}.
{{{
create_report(mydata)
}}}
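
`create_report()` also accepts arguments that control which sections the report contains and where it is written. The following is a minimal sketch, assuming `rmarkdown` and pandoc are available for rendering. The output file name `iris_report.html`, the use of `tempdir()`, the report title, and the choice of sections to skip are illustrative assumptions and do not come from the original post.

```r
library(DataExplorer)

# Same test data as above: iris with missing values injected into Petal.Length
mydata <- iris
mydata[5:50, 3] <- NA

# Choose which sections to include; skipping the QQ plots and PCA here
# is an arbitrary choice for illustration
report_config <- configure_report(
  add_plot_qq = FALSE,
  add_plot_prcomp = FALSE
)

# Hypothetical output location: a temporary directory
out_dir <- tempdir()
out_file <- "iris_report.html"

create_report(
  mydata,
  y = "Species",  # response variable, passed on to the relevant plotting functions
  output_file = out_file,
  output_dir = out_dir,
  report_title = "Iris Data Profiling Report",
  config = report_config
)

# Full path of the rendered HTML report
report_path <- file.path(out_dir, out_file)
```

Opening `report_path` in a browser shows the rendered report. Writing to `tempdir()` keeps the working directory clean; for a report that should persist, point `output_dir` at a project folder instead.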