
SQL:2011 SQL Framework standard document (ISO/IEC JTC 1/SC 32 N 2153)

Data Management and Interchange

http://jtc1sc32.org/doc/N2151-2200/32N2153T-text_for_ballot-FDIS_9075-1.pdf

   

   

Direct download

32N2153T-text_for_ballot-FDIS_9075-1.pdf


Posted by codedragon

Tableau

  • A database information visualization tool that presents database tables visually
  • Born out of an R&D project by Pat Hanrahan, a professor at Stanford University
  • Pat Hanrahan and Chris Stolte led the development of VizQL™, a visual query language
  • VizQL™: a declarative language that lets users obtain graphical/visual results while interacting with a database
  • Provides an intuitive GUI

   

http://www.tableau.com/

   

   

Tableau Online Product Tour

http://youtu.be/yJI3dV2FWwU

   

   

Downloading the trial

The website offers a two-week trial version.

Click FREE TRIAL at the top right.

   

Direct download


TableauDesktop-64bit.zip.001
TableauDesktop-64bit.zip.002
TableauDesktop-64bit.zip.003
TableauDesktop-64bit.zip.004
TableauDesktop-64bit.zip.005
TableauDesktop-64bit.zip.006
TableauDesktop-64bit.zip.007
TableauDesktop-64bit.zip.008
TableauDesktop-64bit.zip.009
TableauDesktop-64bit.zip.010
TableauDesktop-64bit.zip.011
TableauDesktop-64bit.zip.012
TableauDesktop-64bit.zip.013


Posted by codedragon

Korea DB - National DB

  • Digitizes and opens up information from diverse fields with high national preservation and use value, such as science and technology, education, culture, history, and construction
  • The open data can be browsed in the DB catalog

   

http://koreadb.data.go.kr/

   

   

Posted by codedragon

KDATA - Linked Data for KOREA

Covers public data, open data, and LOD (Linked Open Data).

   

http://dakchigo.kr/index.jsp

   

   

http://kdata.kr/

   

   

LODAC 2015

http://dakchigo.kr/events/part6/index.jsp

Posted by codedragon

2015. 1. 18. 09:19

Running SQLite (Development/Database)

   

   

Running SQLite

Start > Run > type cmd.exe

   

Change to the directory containing the sqlite3.exe file:

cd C:\SecureCoding\SQLite\sqlite-shell

   

Use dir to list the files in that folder and confirm sqlite3.exe is there:

dir

   

Run sqlite3.exe:
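sqlite3.exe

   

As a quick sanity check, a minimal first session might look like the sketch below. The test.db database and the member table are invented for illustration; dot commands such as .quit are built into the sqlite3 shell, and your output formatting may differ.

sqlite3.exe test.db
sqlite> CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT);  -- hypothetical example table
sqlite> INSERT INTO member (name) VALUES ('codedragon');
sqlite> SELECT * FROM member;
1|codedragon
sqlite> .quit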

   

   

Posted by codedragon

kaggle.com

  • An online platform where data scientists from around the world compete to solve specific problems
  • Through a "competition" model, data scientists apply a variety of strategies and algorithms grounded in machine learning and statistics to work out solutions to the problem.

   

A recent case

  • General Electric (GE) in the US wanted a way to predict the arrival times of international flights more accurately. GE offered a large cash prize and provided Kaggle with a huge volume of (big) data covering weather, aircraft positions, flight times, fuel consumption, and more. (Problem setters are mostly companies or institutions that hold big data but lack the experts to analyze it.)
  • Kaggle published the problem online,
  • and data scientists around the world, working alone or in teams, set about solving it; the stage-1 winners were decided at the end of last year.
  • Applying their algorithms is said to predict flight arrival times 49% more accurately than current methods.
  • The winners earned US$250,000 in prize money (about ₩272,125,000), the honor of being "players among players", and the attention of recruiters at leading companies worldwide.

   

   

Data scientist

Computers do the actual computation, but creating the algorithms that derive predictive models for analysis is a human job; people who do this kind of work are called data scientists.

   

   

Kaggle site

http://www.kaggle.com/

   

   

Kaggle: How it Works

http://youtu.be/PoD84TVdD-4


   

   

[Change Makers Who Change the World] Building a 'ring' for data scientists ... creating a path to optimal solutions - Kaggle founder Anthony Goldbloom

http://sunday.joins.com/article/view.asp?aid=33816

   

Posted by codedragon codedragon

댓글을 달아 주세요

   

   

Certification names

  • DAP (Data Architecture Professional): nationally certified data architecture professional
  • DAsP (Data Architecture Semi-Professional): data architecture semi-professional
  • SQLP (SQL Professional): nationally certified SQL professional
  • SQLD (SQL Developer): nationally certified SQL developer
  • ADP (Advanced Data Analytics Professional): data analytics professional
  • ADsP (Advanced Data Analytics Semi-Professional): data analytics semi-professional

   

   

   

Exam information and registration (DBGuide.net)

http://www.dbguide.net/da.db

   

   

   

DAP/DAsP certification community cafe

http://cafe.naver.com/onlydap

   

   

   

SQL certification expert forum cafe

http://cafe.naver.com/sqlpd

   

Posted by codedragon

 

An Introduction to Statistical Learning site

http://www-bcf.usc.edu/~gareth/ISL/

   

 

 

Introduction to Statistical Learning - YouTube

http://youtu.be/St2-97n7atk

 



An Introduction to Statistical Learning

 

 

Table of Contents

Preface vii
1 Introduction 1
2 Statistical Learning 15
  2.1 What Is Statistical Learning? 15
    2.1.1 Why Estimate f? 17
    2.1.2 How Do We Estimate f? 21
    2.1.3 The Trade-Off Between Prediction Accuracy and Model Interpretability 24
    2.1.4 Supervised Versus Unsupervised Learning 26
    2.1.5 Regression Versus Classification Problems 28
  2.2 Assessing Model Accuracy 29
    2.2.1 Measuring the Quality of Fit 29
    2.2.2 The Bias-Variance Trade-Off 33
    2.2.3 The Classification Setting 37
  2.3 Lab: Introduction to R 42
    2.3.1 Basic Commands 42
    2.3.2 Graphics 45
    2.3.3 Indexing Data 47
    2.3.4 Loading Data 48
    2.3.5 Additional Graphical and Numerical Summaries 49
  2.4 Exercises 52
3 Linear Regression 59
  3.1 Simple Linear Regression 61
    3.1.1 Estimating the Coefficients 61
    3.1.2 Assessing the Accuracy of the Coefficient Estimates 63
    3.1.3 Assessing the Accuracy of the Model 68
  3.2 Multiple Linear Regression 71
    3.2.1 Estimating the Regression Coefficients 72
    3.2.2 Some Important Questions 75
  3.3 Other Considerations in the Regression Model 82
    3.3.1 Qualitative Predictors 82
    3.3.2 Extensions of the Linear Model 86
    3.3.3 Potential Problems 92
  3.4 The Marketing Plan 102
  3.5 Comparison of Linear Regression with K-Nearest Neighbors 104
  3.6 Lab: Linear Regression 109
    3.6.1 Libraries 109
    3.6.2 Simple Linear Regression 110
    3.6.3 Multiple Linear Regression 113
    3.6.4 Interaction Terms 115
    3.6.5 Non-linear Transformations of the Predictors 115
    3.6.6 Qualitative Predictors 117
    3.6.7 Writing Functions 119
  3.7 Exercises 120
4 Classification 127
  4.1 An Overview of Classification 128
  4.2 Why Not Linear Regression? 129
  4.3 Logistic Regression 130
    4.3.1 The Logistic Model 131
    4.3.2 Estimating the Regression Coefficients 133
    4.3.3 Making Predictions 134
    4.3.4 Multiple Logistic Regression 135
    4.3.5 Logistic Regression for >2 Response Classes 137
  4.4 Linear Discriminant Analysis 138
    4.4.1 Using Bayes' Theorem for Classification 138
    4.4.2 Linear Discriminant Analysis for p = 1 139
    4.4.3 Linear Discriminant Analysis for p > 1 142
    4.4.4 Quadratic Discriminant Analysis 149
  4.5 A Comparison of Classification Methods 151
  4.6 Lab: Logistic Regression, LDA, QDA, and KNN 154
    4.6.1 The Stock Market Data 154
    4.6.2 Logistic Regression 156
    4.6.3 Linear Discriminant Analysis 161
    4.6.4 Quadratic Discriminant Analysis 163
    4.6.5 K-Nearest Neighbors 163
    4.6.6 An Application to Caravan Insurance Data 165
  4.7 Exercises 168
5 Resampling Methods 175
  5.1 Cross-Validation 176
    5.1.1 The Validation Set Approach 176
    5.1.2 Leave-One-Out Cross-Validation 178
    5.1.3 k-Fold Cross-Validation 181
    5.1.4 Bias-Variance Trade-Off for k-Fold Cross-Validation 183
    5.1.5 Cross-Validation on Classification Problems 184
  5.2 The Bootstrap 187
  5.3 Lab: Cross-Validation and the Bootstrap 190
    5.3.1 The Validation Set Approach 191
    5.3.2 Leave-One-Out Cross-Validation 192
    5.3.3 k-Fold Cross-Validation 193
    5.3.4 The Bootstrap 194
  5.4 Exercises 197
6 Linear Model Selection and Regularization 203
  6.1 Subset Selection 205
    6.1.1 Best Subset Selection 205
    6.1.2 Stepwise Selection 207
    6.1.3 Choosing the Optimal Model 210
  6.2 Shrinkage Methods 214
    6.2.1 Ridge Regression 215
    6.2.2 The Lasso 219
    6.2.3 Selecting the Tuning Parameter 227
  6.3 Dimension Reduction Methods 228
    6.3.1 Principal Components Regression 230
    6.3.2 Partial Least Squares 237
  6.4 Considerations in High Dimensions 238
    6.4.1 High-Dimensional Data 238
    6.4.2 What Goes Wrong in High Dimensions? 239
    6.4.3 Regression in High Dimensions 241
    6.4.4 Interpreting Results in High Dimensions 243
  6.5 Lab 1: Subset Selection Methods 244
    6.5.1 Best Subset Selection 244
    6.5.2 Forward and Backward Stepwise Selection 247
    6.5.3 Choosing Among Models Using the Validation Set Approach and Cross-Validation 248
  6.6 Lab 2: Ridge Regression and the Lasso 251
    6.6.1 Ridge Regression 251
    6.6.2 The Lasso 255
  6.7 Lab 3: PCR and PLS Regression 256
    6.7.1 Principal Components Regression 256
    6.7.2 Partial Least Squares 258
  6.8 Exercises 259
7 Moving Beyond Linearity 265
  7.1 Polynomial Regression 266
  7.2 Step Functions 268
  7.3 Basis Functions 270
  7.4 Regression Splines 271
    7.4.1 Piecewise Polynomials 271
    7.4.2 Constraints and Splines 271
    7.4.3 The Spline Basis Representation 273
    7.4.4 Choosing the Number and Locations of the Knots 274
    7.4.5 Comparison to Polynomial Regression 276
  7.5 Smoothing Splines 277
    7.5.1 An Overview of Smoothing Splines 277
    7.5.2 Choosing the Smoothing Parameter λ 278
  7.6 Local Regression 280
  7.7 Generalized Additive Models 282
    7.7.1 GAMs for Regression Problems 283
    7.7.2 GAMs for Classification Problems 286
  7.8 Lab: Non-linear Modeling 287
    7.8.1 Polynomial Regression and Step Functions 288
    7.8.2 Splines 293
    7.8.3 GAMs 294
  7.9 Exercises 297
8 Tree-Based Methods 303
  8.1 The Basics of Decision Trees 303
    8.1.1 Regression Trees 304
    8.1.2 Classification Trees 311
    8.1.3 Trees Versus Linear Models 314
    8.1.4 Advantages and Disadvantages of Trees 315
  8.2 Bagging, Random Forests, Boosting 316
    8.2.1 Bagging 316
    8.2.2 Random Forests 320
    8.2.3 Boosting 321
  8.3 Lab: Decision Trees 324
    8.3.1 Fitting Classification Trees 324
    8.3.2 Fitting Regression Trees 327
    8.3.3 Bagging and Random Forests 328
    8.3.4 Boosting 330
  8.4 Exercises 332
9 Support Vector Machines 337
  9.1 Maximal Margin Classifier 338
    9.1.1 What Is a Hyperplane? 338
    9.1.2 Classification Using a Separating Hyperplane 339
    9.1.3 The Maximal Margin Classifier 341
    9.1.4 Construction of the Maximal Margin Classifier 342
    9.1.5 The Non-separable Case 343
  9.2 Support Vector Classifiers 344
    9.2.1 Overview of the Support Vector Classifier 344
    9.2.2 Details of the Support Vector Classifier 345
  9.3 Support Vector Machines 349
    9.3.1 Classification with Non-linear Decision Boundaries 349
    9.3.2 The Support Vector Machine 350
    9.3.3 An Application to the Heart Disease Data 354
  9.4 SVMs with More than Two Classes 355
    9.4.1 One-Versus-One Classification 355
    9.4.2 One-Versus-All Classification 356
  9.5 Relationship to Logistic Regression 356
  9.6 Lab: Support Vector Machines 359
    9.6.1 Support Vector Classifier 359
    9.6.2 Support Vector Machine 363
    9.6.3 ROC Curves 365
    9.6.4 SVM with Multiple Classes 366
    9.6.5 Application to Gene Expression Data 366
  9.7 Exercises 368
10 Unsupervised Learning 373
  10.1 The Challenge of Unsupervised Learning 373
  10.2 Principal Components Analysis 374
    10.2.1 What Are Principal Components? 375
    10.2.2 Another Interpretation of Principal Components 379
    10.2.3 More on PCA 380
    10.2.4 Other Uses for Principal Components 385
  10.3 Clustering Methods 385
    10.3.1 K-Means Clustering 386
    10.3.2 Hierarchical Clustering 390
    10.3.3 Practical Issues in Clustering 399
  10.4 Lab 1: Principal Components Analysis 401
  10.5 Lab 2: Clustering 404
    10.5.1 K-Means Clustering 404
    10.5.2 Hierarchical Clustering 406
  10.6 Lab 3: NCI60 Data Example 407
    10.6.1 PCA on the NCI60 Data 408
    10.6.2 Clustering the Observations of the NCI60 Data 410
  10.7 Exercises 413
Index 419



Direct download

ISLR Fourth Printing.pdf




Posted by codedragon

SQLite 3.8.7.2 package (bundled file)

sqlite-shell.zip


   

Execution screen:

Posted by codedragon

   

http://www.sqlite.org/download.html

Click the Download menu at the top, then

scroll the page until the 'Precompiled Binaries for Windows' section comes into view.

   

   

Downloading the zip files

Download all of the zip files listed under 'Precompiled Binaries for Windows'.

   

   

Extracting the SQLite files (installation)

Create a new sqlite-shell folder, then extract the downloaded zip files into it.
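As a rough check that the extraction worked, you can ask the shell for its version from the command prompt. The path below follows the earlier "Running SQLite" post and is only an example; adjust it to wherever you extracted the files.

cd C:\SecureCoding\SQLite\sqlite-shell
sqlite3 -version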

   

Posted by codedragon