
Hadoop - install for Windows

Extracting the Hadoop archive

Run your archive utility as an administrator.

[Run as administrator]

Extract the downloaded archive.

[Extract]

Select the CodeLab folder >> [OK]

Adding environment variables

Create a new environment variable.

Variable name
HADOOP_HOME
Variable value
C:\CodeLab\hadoop-3.1.3

Appending to PATH

Variable name
PATH
Variable value
%HADOOP_HOME%\bin
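The same variables can also be set from an elevated Command Prompt with setx, as a sketch of the GUI steps above. Note two caveats: /M writes machine-level variables, and setx truncates values longer than 1024 characters, so for long PATH values the GUI dialog is the safer route.

```shell
:: Run in an elevated (administrator) Command Prompt.
setx /M HADOOP_HOME "C:\CodeLab\hadoop-3.1.3"
:: %PATH% expands the combined user+machine PATH at write time.
setx /M PATH "%PATH%;%HADOOP_HOME%\bin"
```

Open a new Command Prompt afterwards; setx does not change the environment of the current window.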

 

 


 

 

 

 

 

 

Verifying the installation

hadoop -version

C:\CodeLab>hadoop -version
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)

C:\CodeLab>

If the JVM details print as above, the hadoop command is on the PATH and working.
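Note that the -version flag makes the hadoop wrapper report the Java runtime; to print the Hadoop release itself, run version without the dash (output abbreviated):

```shell
C:\CodeLab>hadoop version
Hadoop 3.1.3
```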

 

 

 

 

 

 

 

 

HDFS configurations

 

hadoop-env.cmd

Open hadoop-env.cmd, located in:

C:\CodeLab\hadoop-3.1.3\etc\hadoop

Add the following four lines at the very bottom of the file:

set HADOOP_PREFIX=%HADOOP_HOME%
set HADOOP_CONF_DIR=%HADOOP_PREFIX%\etc\hadoop
set YARN_CONF_DIR=%HADOOP_CONF_DIR%
set PATH=%PATH%;%HADOOP_PREFIX%\bin
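hadoop-env.cmd is also where JAVA_HOME is set. The Windows scripts break on a JAVA_HOME that contains spaces, so if your JDK lives under C:\Program Files, use the 8.3 short name. The JDK path below is an example; adjust it to your actual install:

```shell
:: Example JAVA_HOME line in hadoop-env.cmd (hypothetical JDK location; use your own).
:: PROGRA~1 is the 8.3 short name for "Program Files", avoiding spaces that break the scripts.
set JAVA_HOME=C:\PROGRA~1\Java\jdk1.8.0_65
```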


 

 

 

 

core-site.xml

Open core-site.xml, located in:

C:\CodeLab\hadoop-3.1.3\etc\hadoop

Add a property tag inside the configuration element:

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://0.0.0.0:19000</value>
</property>
</configuration>
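fs.default.name is a legacy key; since Hadoop 2.x the preferred name is fs.defaultFS (the old key still works but logs a deprecation warning). An equivalent modern form:

```xml
<configuration>
<property>
<!-- Preferred key name; same value as fs.default.name above. -->
<name>fs.defaultFS</name>
<value>hdfs://0.0.0.0:19000</value>
</property>
</configuration>
```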

 

 

 

Creating the data folder

Create a data folder under C:\CodeLab\hadoop-3.1.3:

C:\CodeLab\hadoop-3.1.3

Then create a "namenode" folder and a "datanode" folder inside C:\CodeLab\hadoop-3.1.3\data:

C:\CodeLab\hadoop-3.1.3\data
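The same folders can be created from a Command Prompt; md creates intermediate directories as needed, so the data folder does not have to exist first:

```shell
md C:\CodeLab\hadoop-3.1.3\data\namenode
md C:\CodeLab\hadoop-3.1.3\data\datanode
```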

 


 

 

 

 

hdfs-site.xml

Open hdfs-site.xml, located in:

C:\CodeLab\hadoop-3.1.3\etc\hadoop

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///C:/CodeLab/hadoop-3.1.3/data/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///C:/CodeLab/hadoop-3.1.3/data/datanode</value>
</property>
</configuration>

The namespace, logs, and data files will be stored under these directories. Note that dfs.name.dir and dfs.data.dir are deprecated aliases of dfs.namenode.name.dir and dfs.datanode.data.dir in Hadoop 3.x; the old names still work but log a deprecation warning.

 

 

 

YARN configurations

 

mapred-site.xml

Open the mapred-site.xml file, located in:

C:\CodeLab\hadoop-3.1.3\etc\hadoop

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

<property>
<name>mapreduce.jobtracker.address</name>
<value>local</value>
</property>
</configuration>

 


 

 

 

yarn-site.xml

Open the yarn-site.xml file, located in:

C:\CodeLab\hadoop-3.1.3\etc\hadoop

<configuration>

 

<!-- Site specific YARN configuration properties -->

<property>

<name>yarn.server.resourcemanager.address</name>

<value>0.0.0.0:8020</value>

</property>

 

 

<property>

<name>yarn.server.resourcemanager.application.expiry.interval</name>

<value>60000</value>

</property>

 

 

<property>

<name>yarn.server.nodemanager.address</name>

<value>0.0.0.0:45454</value>

</property>

 

 

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

 

 

<property>

<name>yarn.server.nodemanager.remote-app-log-dir</name>

<value>/app-logs</value>

</property>

 

 

<property>

<name>yarn.nodemanager.log-dirs</name>

<value>/dep/logs/userlogs</value>

</property>

 

 

<property>

<name>yarn.server.mapreduce-appmanager.attempt-listener.bindAddress</name>

<value>0.0.0.0</value>

</property>

 

 

<property>

<name>yarn.server.mapreduce-appmanager.client-service.bindAddress</name>

<value>0.0.0.0</value>

</property>

 

 

<property>

<name>yarn.log-aggregation-enable</name>

<value>true</value>

</property>

 

 

<property>

<name>yarn.log-aggregation.retain-seconds</name>

<value>-1</value>

</property>

 

 

<property>

<name>yarn.application.classpath</name>

<value>%HADOOP_CONF_DIR%,%HADOOP_COMMON_HOME%/share/hadoop/common/*,%HADOOP_COMMON_HOME%/share/hadoop/common/lib/*,%HADOOP_HDFS_HOME%/share/hadoop/hdfs/*,%HADOOP_HDFS_HOME%/share/hadoop/hdfs/lib/*,%HADOOP_MAPRED_HOME%/share/hadoop/mapreduce/*,%HADOOP_MAPRED_HOME%/share/hadoop/mapreduce/lib/*,%HADOOP_YARN_HOME%/share/hadoop/yarn/*,%HADOOP_YARN_HOME%/share/hadoop/yarn/lib/*</value>

</property>

 

</configuration>

 

 


 

 

 

 

Initialize environment variables

Move to the directory containing hadoop-env.cmd, the script that sets the environment variables.

C:\CodeLab\hadoop-3.1.3\etc\hadoop

C:\CodeLab>cd C:\CodeLab\hadoop-3.1.3\etc\hadoop

 

C:\CodeLab\hadoop-3.1.3\etc\hadoop>

 

 

C:\CodeLab\hadoop-3.1.3\etc\hadoop>dir *.cmd

 Volume in drive C has no label.

 Volume Serial Number is CEC6-6B66

 

 Directory of C:\CodeLab\hadoop-3.1.3\etc\hadoop

 

10/27/2019  08:14 PM             4,154 hadoop-env.cmd

09/12/2019  01:11 PM               951 mapred-env.cmd

09/12/2019  01:06 PM             2,250 yarn-env.cmd

               3 File(s)          7,355 bytes

               0 Dir(s)  147,772,735,488 bytes free

 

C:\CodeLab\hadoop-3.1.3\etc\hadoop>

 

 

C:\CodeLab\hadoop-3.1.3\etc\hadoop>hadoop-env.cmd

 

C:\CodeLab\hadoop-3.1.3\etc\hadoop>

 

 

 

Formatting the file system

 

C:\CodeLab\hadoop-3.1.3\etc\hadoop>hadoop namenode -format

DEPRECATED: Use of this script to execute hdfs command is deprecated.

Instead use the hdfs command for it.

2019-10-27 20:46:46,750 WARN util.Shell: Did not find winutils.exe: {}

java.io.FileNotFoundException: Could not locate Hadoop executable: C:\CodeLab\hadoop-3.1.3\bin\winutils.exe -see https://wiki.apache.org/hadoop/WindowsProblems

        at org.apache.hadoop.util.Shell.getQualifiedBinInner(Shell.java:620)

        at org.apache.hadoop.util.Shell.getQualifiedBin(Shell.java:593)

        at org.apache.hadoop.util.Shell.<clinit>(Shell.java:690)

        at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:78)

        at org.apache.hadoop.hdfs.server.common.HdfsServerConstants$RollingUpgradeStartupOption.getAllOptionString(HdfsServerConstants.java:127)

        at org.apache.hadoop.hdfs.server.namenode.NameNode.<clinit>(NameNode.java:324)

2019-10-27 20:46:46,996 INFO namenode.NameNode: STARTUP_MSG:

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG:   host = CODEMASTER/xxx.xxx.xxx.xxx

STARTUP_MSG:   args = [-format]

STARTUP_MSG:   version = 3.1.3

STARTUP_MSG:   classpath = C:\CodeLab\hadoop-3.1.3\etc\hadoop;C:\CodeLab\hadoop-3.1.3\share\hadoop\common;C:\CodeLab\hadoop-3.1.3\share\hadoop\common\lib\accessors-smart-1.2.jar;C:\CodeLab\hadoop-3.1.3\share\hadoop\common\lib\animal-sniffer-annotations-1.17.jar;C:\CodeLab\hadoop-3.1.3\share\hadoop\common\lib\asm-5.0.4.jar;C:\CodeLab\hadoop-3.1.3\share\hadoop\common\lib\audience-annotations-0.5.0.jar;C:\CodeLab\hadoop-3.1.3\share\hadoop\common\lib\avro-1.7

 

... (omitted)

 

STARTUP_MSG:   build = https://gitbox.apache.org/repos/asf/hadoop.git -r ba631c436b806728f8ec2f54ab1e289526c90579; compiled by 'ztang' on 2019-09-12T02:47Z

STARTUP_MSG:   java = 1.8.0_65

************************************************************/

2019-10-27 20:46:47,347 INFO namenode.NameNode: createNameNode [-format]

2019-10-27 20:46:47,779 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Formatting using clusterid: CID-2a188cb9-4f77-4c04-ba4c-1307eb61d1d5

2019-10-27 20:46:50,687 INFO namenode.FSEditLog: Edit logging is async:true

2019-10-27 20:46:51,052 INFO namenode.FSNamesystem: KeyProvider: null

2019-10-27 20:46:51,054 INFO namenode.FSNamesystem: fsLock is fair: true

2019-10-27 20:46:51,056 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false

2019-10-27 20:46:51,158 INFO namenode.FSNamesystem: fsOwner             = codedragon (auth:SIMPLE)

2019-10-27 20:46:51,159 INFO namenode.FSNamesystem: supergroup          = supergroup

2019-10-27 20:46:51,160 INFO namenode.FSNamesystem: isPermissionEnabled = true

2019-10-27 20:46:51,164 INFO namenode.FSNamesystem: HA Enabled: false

2019-10-27 20:46:51,236 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling

2019-10-27 20:46:51,311 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000

2019-10-27 20:46:51,311 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true

2019-10-27 20:46:51,322 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000

2019-10-27 20:46:51,323 INFO blockmanagement.BlockManager: The block deletion will start around 2019 Oct 27 20:46:51

2019-10-27 20:46:51,340 INFO util.GSet: Computing capacity for map BlocksMap

2019-10-27 20:46:51,340 INFO util.GSet: VM type       = 64-bit

2019-10-27 20:46:51,362 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB

2019-10-27 20:46:51,362 INFO util.GSet: capacity      = 2^21 = 2097152 entries

2019-10-27 20:46:51,385 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false

2019-10-27 20:46:51,394 INFO Configuration.deprecation: No unit for dfs.namenode.safemode.extension(30000) assuming MILLISECONDS

2019-10-27 20:46:51,394 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033

2019-10-27 20:46:51,394 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0

2019-10-27 20:46:51,395 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000

2019-10-27 20:46:51,396 INFO blockmanagement.BlockManager: defaultReplication         = 1

2019-10-27 20:46:51,397 INFO blockmanagement.BlockManager: maxReplication             = 512

2019-10-27 20:46:51,397 INFO blockmanagement.BlockManager: minReplication             = 1

2019-10-27 20:46:51,397 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2

2019-10-27 20:46:51,398 INFO blockmanagement.BlockManager: redundancyRecheckInterval  = 3000ms

2019-10-27 20:46:51,398 INFO blockmanagement.BlockManager: encryptDataTransfer        = false

2019-10-27 20:46:51,398 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000

2019-10-27 20:46:51,457 INFO namenode.FSDirectory: GLOBAL serial map: bits=24 maxEntries=16777215

2019-10-27 20:46:51,497 INFO util.GSet: Computing capacity for map INodeMap

2019-10-27 20:46:51,497 INFO util.GSet: VM type       = 64-bit

2019-10-27 20:46:51,498 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB

2019-10-27 20:46:51,499 INFO util.GSet: capacity      = 2^20 = 1048576 entries

2019-10-27 20:46:51,501 INFO namenode.FSDirectory: ACLs enabled? false

2019-10-27 20:46:51,501 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true

2019-10-27 20:46:51,502 INFO namenode.FSDirectory: XAttrs enabled? true

2019-10-27 20:46:51,503 INFO namenode.NameNode: Caching file names occurring more than 10 times

2019-10-27 20:46:51,512 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRootDescendant: true, maxSnapshotLimit: 65536

2019-10-27 20:46:51,515 INFO snapshot.SnapshotManager: SkipList is disabled

2019-10-27 20:46:51,542 INFO util.GSet: Computing capacity for map cachedBlocks

2019-10-27 20:46:51,542 INFO util.GSet: VM type       = 64-bit

2019-10-27 20:46:51,543 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB

2019-10-27 20:46:51,544 INFO util.GSet: capacity      = 2^18 = 262144 entries

2019-10-27 20:46:51,559 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10

2019-10-27 20:46:51,559 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10

2019-10-27 20:46:51,560 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25

2019-10-27 20:46:51,564 INFO namenode.FSNamesystem: Retry cache on namenode is enabled

2019-10-27 20:46:51,564 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis

2019-10-27 20:46:51,567 INFO util.GSet: Computing capacity for map NameNodeRetryCache

2019-10-27 20:46:51,567 INFO util.GSet: VM type       = 64-bit

2019-10-27 20:46:51,567 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB

2019-10-27 20:46:51,567 INFO util.GSet: capacity      = 2^15 = 32768 entries

2019-10-27 20:46:51,639 INFO namenode.FSImage: Allocated new BlockPoolId: BP-2118333937-xxx.xxx.xxx.xxx-1572176811632

2019-10-27 20:46:51,752 INFO common.Storage: Storage directory C:\CodeLab\hadoop-3.1.3\data\namenode has been successfully formatted.

2019-10-27 20:46:51,838 INFO namenode.FSImageFormatProtobuf: Saving image file C:\CodeLab\hadoop-3.1.3\data\namenode\current\fsimage.ckpt_0000000000000000000 using no compression

2019-10-27 20:46:51,993 INFO namenode.FSImageFormatProtobuf: Image file C:\CodeLab\hadoop-3.1.3\data\namenode\current\fsimage.ckpt_0000000000000000000 of size 394 bytes saved in 0 seconds .

2019-10-27 20:46:52,190 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0

2019-10-27 20:46:52,230 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid = 0 when meet shutdown.

2019-10-27 20:46:52,230 INFO namenode.NameNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at CODEMASTER/xxx.xxx.xxx.xxx

************************************************************/

 

C:\CodeLab\hadoop-3.1.3\etc\hadoop>

The paths derived from the HDFS configuration appear in the log above (the "Storage directory" and image file lines; shown in bold in the original post).
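The WARN near the top of the log ("Could not locate Hadoop executable: ...winutils.exe") is worth acting on: the Apache release tarball ships without the Windows native helpers, and while the format still succeeds, the daemons generally need winutils.exe (and hadoop.dll) from a Windows build of Hadoop 3.1.x placed in %HADOOP_HOME%\bin. You can check whether they are present:

```shell
:: Both files should exist before starting the daemons.
dir C:\CodeLab\hadoop-3.1.3\bin\winutils.exe
dir C:\CodeLab\hadoop-3.1.3\bin\hadoop.dll
```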

 

 

 

Start HDFS daemons

 

C:\CodeLab\hadoop-3.1.3\etc\hadoop>%HADOOP_HOME%\sbin\start-dfs.cmd

 

C:\CodeLab\hadoop-3.1.3\etc\hadoop>

 

Two command windows open: one for the namenode and one for the datanode.


 

[Allow access]


 

 

 

 

 

Start YARN daemons

 

 

 

%HADOOP_HOME%\sbin\start-all.cmd

C:\CodeLab\hadoop-3.1.3\etc\hadoop>%HADOOP_HOME%\sbin\start-all.cmd

This script is Deprecated. Instead use start-dfs.cmd and start-yarn.cmd

starting yarn daemons

 

C:\CodeLab\hadoop-3.1.3\etc\hadoop>

 

 

Four daemons start up:

Hadoop NameNode
Hadoop DataNode
YARN Resource Manager
YARN Node Manager
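You can confirm that all four JVM processes are running with the JDK's jps tool (assuming the JDK bin directory is on the PATH; PIDs will differ):

```shell
:: Lists running JVMs; expect NameNode, DataNode, ResourceManager, and NodeManager entries.
jps
```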


 

 

Opening the Resource Manager

You can check job status through the YARN web UI:

http://localhost:8088

The NameNode web UI is similarly available at http://localhost:9870 in Hadoop 3.x.


 

 

 

 

Configuration file download

hadoop-configurations.zip

Posted by codedragon