CODEDRAGON ㆍDevelopment/Java
Hadoop - Install for Windows
- Extracting the Hadoop archive
- Adding environment variables
- Verifying the installation
- HDFS configurations
- YARN configurations
- Initialize environment variables
- Formatting the file system
- Start HDFS daemons
- Start YARN daemons
- Downloading the configuration files
Extracting the Hadoop archive
Run the archive utility as an administrator.
[Run as administrator]
Extract the downloaded archive.
[Extract]
Select the CodeLab folder >> [OK]
Adding environment variables
Create a new environment variable.
Variable name: HADOOP_HOME
Variable value: C:\CodeLab\hadoop-3.1.3
Add the path to PATH.
Variable name: PATH
Variable value: %HADOOP_HOME%\bin
Verifying the installation
hadoop -version
C:\CodeLab>hadoop -version
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)

C:\CodeLab>
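Note that `hadoop -version` forwards `-version` to the JVM, which is why the output above shows the Java version (this still confirms that the `hadoop` command resolves and that Java is found). To print the Hadoop release itself, the subcommand is `version` with no dash:

```
C:\CodeLab>hadoop version
Hadoop 3.1.3
(additional build information follows)
```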
HDFS configurations
hadoop-env.cmd
Open hadoop-env.cmd in:
C:\CodeLab\hadoop-3.1.3\etc\hadoop
Add the following four lines at the bottom of the file.
set HADOOP_PREFIX=%HADOOP_HOME%
set HADOOP_CONF_DIR=%HADOOP_PREFIX%\etc\hadoop
set YARN_CONF_DIR=%HADOOP_CONF_DIR%
set PATH=%PATH%;%HADOOP_PREFIX%\bin
core-site.xml
Open core-site.xml in:
C:\CodeLab\hadoop-3.1.3\etc\hadoop
Add the property tag below inside the configuration element.
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:19000</value>
  </property>
</configuration>
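As a side note, `fs.default.name` is a deprecated property name in Hadoop 2.x and later; it still works but logs a deprecation warning. The current name for the same setting is `fs.defaultFS`, so an equivalent modern form would be:

```xml
<configuration>
  <property>
    <!-- fs.defaultFS is the current name for the deprecated fs.default.name -->
    <name>fs.defaultFS</name>
    <value>hdfs://0.0.0.0:19000</value>
  </property>
</configuration>
```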
Creating the data folders
Create a data folder under C:\CodeLab\hadoop-3.1.3.
C:\CodeLab\hadoop-3.1.3
Create a "namenode" folder and a "datanode" folder inside C:\CodeLab\hadoop-3.1.3\data.
C:\CodeLab\hadoop-3.1.3\data
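The same folders can also be created from a Command Prompt; with command extensions enabled (the cmd.exe default), `mkdir` creates the intermediate `data` folder automatically:

```
C:\CodeLab>mkdir C:\CodeLab\hadoop-3.1.3\data\namenode
C:\CodeLab>mkdir C:\CodeLab\hadoop-3.1.3\data\datanode
```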
hdfs-site.xml
Open hdfs-site.xml in:
C:\CodeLab\hadoop-3.1.3\etc\hadoop
Add the property tags below inside the configuration element.
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>file:///C:/CodeLab/hadoop-3.1.3/data/namenode</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:///C:/CodeLab/hadoop-3.1.3/data/datanode</value>
  </property>
</configuration>
The namespace, logs, and data files are stored in these directories.
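`dfs.name.dir` and `dfs.data.dir` are likewise deprecated property names that Hadoop maps to their current equivalents, `dfs.namenode.name.dir` and `dfs.datanode.data.dir`. Using the current names avoids deprecation warnings at startup:

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <!-- current name for the deprecated dfs.name.dir -->
    <name>dfs.namenode.name.dir</name>
    <value>file:///C:/CodeLab/hadoop-3.1.3/data/namenode</value>
  </property>
  <property>
    <!-- current name for the deprecated dfs.data.dir -->
    <name>dfs.datanode.data.dir</name>
    <value>file:///C:/CodeLab/hadoop-3.1.3/data/datanode</value>
  </property>
</configuration>
```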
YARN configurations
mapred-site.xml
Open the mapred-site.xml file in:
C:\CodeLab\hadoop-3.1.3\etc\hadoop
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>local</value>
  </property>
</configuration>
yarn-site.xml
Open the yarn-site.xml file in:
C:\CodeLab\hadoop-3.1.3\etc\hadoop
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.server.resourcemanager.address</name>
    <value>0.0.0.0:8020</value>
  </property>
  <property>
    <name>yarn.server.resourcemanager.application.expiry.interval</name>
    <value>60000</value>
  </property>
  <property>
    <name>yarn.server.nodemanager.address</name>
    <value>0.0.0.0:45454</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.server.nodemanager.remote-app-log-dir</name>
    <value>/app-logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/dep/logs/userlogs</value>
  </property>
  <property>
    <name>yarn.server.mapreduce-appmanager.attempt-listener.bindAddress</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>yarn.server.mapreduce-appmanager.client-service.bindAddress</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>-1</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>%HADOOP_CONF_DIR%,%HADOOP_COMMON_HOME%/share/hadoop/common/*,%HADOOP_COMMON_HOME%/share/hadoop/common/lib/*,%HADOOP_HDFS_HOME%/share/hadoop/hdfs/*,%HADOOP_HDFS_HOME%/share/hadoop/hdfs/lib/*,%HADOOP_MAPRED_HOME%/share/hadoop/mapreduce/*,%HADOOP_MAPRED_HOME%/share/hadoop/mapreduce/lib/*,%HADOOP_YARN_HOME%/share/hadoop/yarn/*,%HADOOP_YARN_HOME%/share/hadoop/yarn/lib/*</value>
  </property>
</configuration>
Initialize environment variables
Move to the directory containing hadoop-env.cmd, the script that sets the environment variables.
C:\CodeLab\hadoop-3.1.3\etc\hadoop
C:\CodeLab>cd C:\CodeLab\hadoop-3.1.3\etc\hadoop

C:\CodeLab\hadoop-3.1.3\etc\hadoop>
C:\CodeLab\hadoop-3.1.3\etc\hadoop>dir *.cmd
 Volume in drive C has no label.
 Volume Serial Number is CEC6-6B66

 Directory of C:\CodeLab\hadoop-3.1.3\etc\hadoop

10/27/2019  08:14 PM             4,154 hadoop-env.cmd
09/12/2019  01:11 PM               951 mapred-env.cmd
09/12/2019  01:06 PM             2,250 yarn-env.cmd
               3 File(s)          7,355 bytes
               0 Dir(s)  147,772,735,488 bytes free

C:\CodeLab\hadoop-3.1.3\etc\hadoop>
C:\CodeLab\hadoop-3.1.3\etc\hadoop>hadoop-env.cmd

C:\CodeLab\hadoop-3.1.3\etc\hadoop>
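A batch file invoked from the prompt runs in the current cmd session, so the `set` statements in hadoop-env.cmd take effect in this window (and only this window; a new console needs the script run again). This can be confirmed by echoing one of the variables it defines:

```
C:\CodeLab\hadoop-3.1.3\etc\hadoop>echo %HADOOP_PREFIX%
C:\CodeLab\hadoop-3.1.3
```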
Formatting the file system
C:\CodeLab\hadoop-3.1.3\etc\hadoop>hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

2019-10-27 20:46:46,750 WARN util.Shell: Did not find winutils.exe: {}
java.io.FileNotFoundException: Could not locate Hadoop executable: C:\CodeLab\hadoop-3.1.3\bin\winutils.exe -see https://wiki.apache.org/hadoop/WindowsProblems
        at org.apache.hadoop.util.Shell.getQualifiedBinInner(Shell.java:620)
        at org.apache.hadoop.util.Shell.getQualifiedBin(Shell.java:593)
        at org.apache.hadoop.util.Shell.<clinit>(Shell.java:690)
        at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:78)
        at org.apache.hadoop.hdfs.server.common.HdfsServerConstants$RollingUpgradeStartupOption.getAllOptionString(HdfsServerConstants.java:127)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<clinit>(NameNode.java:324)
2019-10-27 20:46:46,996 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = CODEMASTER/xxx.xxx.xxx.xxx
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 3.1.3
STARTUP_MSG:   classpath = C:\CodeLab\hadoop-3.1.3\etc\hadoop;C:\CodeLab\hadoop-3.1.3\share\hadoop\common;C:\CodeLab\hadoop-3.1.3\share\hadoop\common\lib\accessors-smart-1.2.jar;C:\CodeLab\hadoop-3.1.3\share\hadoop\common\lib\animal-sniffer-annotations-1.17.jar;C:\CodeLab\hadoop-3.1.3\share\hadoop\common\lib\asm-5.0.4.jar;C:\CodeLab\hadoop-3.1.3\share\hadoop\common\lib\audience-annotations-0.5.0.jar;C:\CodeLab\hadoop-3.1.3\share\hadoop\common\lib\avro-1.7 ... (omitted)
STARTUP_MSG:   build = https://gitbox.apache.org/repos/asf/hadoop.git -r ba631c436b806728f8ec2f54ab1e289526c90579; compiled by 'ztang' on 2019-09-12T02:47Z
STARTUP_MSG:   java = 1.8.0_65
************************************************************/
2019-10-27 20:46:47,347 INFO namenode.NameNode: createNameNode [-format]
2019-10-27 20:46:47,779 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Formatting using clusterid: CID-2a188cb9-4f77-4c04-ba4c-1307eb61d1d5
2019-10-27 20:46:50,687 INFO namenode.FSEditLog: Edit logging is async:true
2019-10-27 20:46:51,052 INFO namenode.FSNamesystem: KeyProvider: null
2019-10-27 20:46:51,054 INFO namenode.FSNamesystem: fsLock is fair: true
2019-10-27 20:46:51,056 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
2019-10-27 20:46:51,158 INFO namenode.FSNamesystem: fsOwner = codedragon (auth:SIMPLE)
2019-10-27 20:46:51,159 INFO namenode.FSNamesystem: supergroup = supergroup
2019-10-27 20:46:51,160 INFO namenode.FSNamesystem: isPermissionEnabled = true
2019-10-27 20:46:51,164 INFO namenode.FSNamesystem: HA Enabled: false
2019-10-27 20:46:51,236 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2019-10-27 20:46:51,311 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
2019-10-27 20:46:51,311 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2019-10-27 20:46:51,322 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2019-10-27 20:46:51,323 INFO blockmanagement.BlockManager: The block deletion will start around 2019 Oct 27 20:46:51
2019-10-27 20:46:51,340 INFO util.GSet: Computing capacity for map BlocksMap
2019-10-27 20:46:51,340 INFO util.GSet: VM type = 64-bit
2019-10-27 20:46:51,362 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
2019-10-27 20:46:51,362 INFO util.GSet: capacity = 2^21 = 2097152 entries
2019-10-27 20:46:51,385 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false
2019-10-27 20:46:51,394 INFO Configuration.deprecation: No unit for dfs.namenode.safemode.extension(30000) assuming MILLISECONDS
2019-10-27 20:46:51,394 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2019-10-27 20:46:51,394 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
2019-10-27 20:46:51,395 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
2019-10-27 20:46:51,396 INFO blockmanagement.BlockManager: defaultReplication = 1
2019-10-27 20:46:51,397 INFO blockmanagement.BlockManager: maxReplication = 512
2019-10-27 20:46:51,397 INFO blockmanagement.BlockManager: minReplication = 1
2019-10-27 20:46:51,397 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
2019-10-27 20:46:51,398 INFO blockmanagement.BlockManager: redundancyRecheckInterval = 3000ms
2019-10-27 20:46:51,398 INFO blockmanagement.BlockManager: encryptDataTransfer = false
2019-10-27 20:46:51,398 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
2019-10-27 20:46:51,457 INFO namenode.FSDirectory: GLOBAL serial map: bits=24 maxEntries=16777215
2019-10-27 20:46:51,497 INFO util.GSet: Computing capacity for map INodeMap
2019-10-27 20:46:51,497 INFO util.GSet: VM type = 64-bit
2019-10-27 20:46:51,498 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
2019-10-27 20:46:51,499 INFO util.GSet: capacity = 2^20 = 1048576 entries
2019-10-27 20:46:51,501 INFO namenode.FSDirectory: ACLs enabled? false
2019-10-27 20:46:51,501 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true
2019-10-27 20:46:51,502 INFO namenode.FSDirectory: XAttrs enabled? true
2019-10-27 20:46:51,503 INFO namenode.NameNode: Caching file names occurring more than 10 times
2019-10-27 20:46:51,512 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRootDescendant: true, maxSnapshotLimit: 65536
2019-10-27 20:46:51,515 INFO snapshot.SnapshotManager: SkipList is disabled
2019-10-27 20:46:51,542 INFO util.GSet: Computing capacity for map cachedBlocks
2019-10-27 20:46:51,542 INFO util.GSet: VM type = 64-bit
2019-10-27 20:46:51,543 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
2019-10-27 20:46:51,544 INFO util.GSet: capacity = 2^18 = 262144 entries
2019-10-27 20:46:51,559 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
2019-10-27 20:46:51,559 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
2019-10-27 20:46:51,560 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2019-10-27 20:46:51,564 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
2019-10-27 20:46:51,564 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2019-10-27 20:46:51,567 INFO util.GSet: Computing capacity for map NameNodeRetryCache
2019-10-27 20:46:51,567 INFO util.GSet: VM type = 64-bit
2019-10-27 20:46:51,567 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
2019-10-27 20:46:51,567 INFO util.GSet: capacity = 2^15 = 32768 entries
2019-10-27 20:46:51,639 INFO namenode.FSImage: Allocated new BlockPoolId: BP-2118333937-xxx.xxx.xxx.xxx-1572176811632
2019-10-27 20:46:51,752 INFO common.Storage: Storage directory C:\CodeLab\hadoop-3.1.3\data\namenode has been successfully formatted.
2019-10-27 20:46:51,838 INFO namenode.FSImageFormatProtobuf: Saving image file C:\CodeLab\hadoop-3.1.3\data\namenode\current\fsimage.ckpt_0000000000000000000 using no compression
2019-10-27 20:46:51,993 INFO namenode.FSImageFormatProtobuf: Image file C:\CodeLab\hadoop-3.1.3\data\namenode\current\fsimage.ckpt_0000000000000000000 of size 394 bytes saved in 0 seconds .
2019-10-27 20:46:52,190 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2019-10-27 20:46:52,230 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid = 0 when meet shutdown.
2019-10-27 20:46:52,230 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at CODEMASTER/xxx.xxx.xxx.xxx
************************************************************/

C:\CodeLab\hadoop-3.1.3\etc\hadoop>
The storage directory path from the HDFS configuration (hdfs-site.xml) appears in the output, shown in bold in the original post.
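A successful format can also be verified on disk: the namenode directory configured above should now contain a `current` subfolder holding the fsimage and VERSION files.

```
C:\CodeLab\hadoop-3.1.3\etc\hadoop>dir C:\CodeLab\hadoop-3.1.3\data\namenode\current
```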
Start HDFS daemons
C:\CodeLab\hadoop-3.1.3\etc\hadoop>%HADOOP_HOME%\sbin\start-dfs.cmd

C:\CodeLab\hadoop-3.1.3\etc\hadoop>
Two command windows open, one for the namenode and one for the datanode.
[Allow access]
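Whether both daemons are up can be checked with `jps`, the JDK's Java process listing tool; NameNode and DataNode should appear in its output (the PIDs below are example values):

```
C:\CodeLab>jps
11111 NameNode
22222 DataNode
33333 Jps
```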
Start YARN daemons
%HADOOP_HOME%\sbin\start-all.cmd
C:\CodeLab\hadoop-3.1.3\etc\hadoop>%HADOOP_HOME%\sbin\start-all.cmd
This script is Deprecated. Instead use start-dfs.cmd and start-yarn.cmd
starting yarn daemons

C:\CodeLab\hadoop-3.1.3\etc\hadoop>
Four daemons are now running:
- Hadoop Namenode
- Hadoop Datanode
- YARN Resource Manager
- YARN Node Manager
Opening the Resource Manager
Job status can be viewed through the YARN web UI.
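With default settings, the ResourceManager web UI is served at http://localhost:8088 and the Hadoop 3.x NameNode web UI at http://localhost:9870. A quick smoke test of the running cluster is to create and list an HDFS directory (`/test` here is just an example name):

```
C:\CodeLab>hadoop fs -mkdir /test
C:\CodeLab>hadoop fs -ls /
```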
Downloading the configuration files