----
First, check the status of the HDFS datanodes
----
# It can be viewed via the web UI
http://localhost:50070/dfshealth.html#tab-datanode
# Or via the command line
[hadoop@hnamenode ~]$ hdfs dfsadmin -report
Configured Capacity: 64280172384256 (58.46 TB)
Present Capacity: 64247015936000 (58.43 TB)
DFS Remaining: 64238110507008 (58.42 TB)
DFS Used: 8905428992 (8.29 GB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (17):
Name: 192.168.1.11:50010 (hdatanode11.cm.nsysu.edu.tw)
Hostname: hdatanode11.cm.nsysu.edu.tw
Decommission Status : Normal
Configured Capacity: 3998832504832 (3.64 TB)
DFS Used: 407236608 (388.37 MB)
Non DFS Used: 745283584 (710.76 MB)
DFS Remaining: 3997679984640 (3.64 TB)
DFS Used%: 0.01%
DFS Remaining%: 99.97%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Sep 26 10:37:29 CST 2015
.... skip ...
Name: 192.168.1.100:50010 (hnamenode.cm.nsysu.edu.tw)
Hostname: hnamenode.cm.nsysu.edu.tw
Decommission Status : Normal
Configured Capacity: 298852306944 (278.33 GB)
DFS Used: 2968330240 (2.76 GB)
Non DFS Used: 21231382528 (19.77 GB)
DFS Remaining: 274652594176 (255.79 GB)
DFS Used%: 0.99%
DFS Remaining%: 91.90%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Sep 26 10:37:30 CST 2015
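# The per-node details can be trimmed down with standard text tools (a quick
# sketch over the report output shown above):
[hadoop@hnamenode ~]$ hdfs dfsadmin -report | grep -E '^(Hostname|DFS Used%)'
# Recent releases (2.7+) can also restrict the report to live or dead nodes:
[hadoop@hnamenode ~]$ hdfs dfsadmin -report -live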
----
Changing the Hadoop HDFS block size (dfs.blocksize)
----
From http://localhost:19888/conf you can see the default dfs.blocksize value:
<property>
  <name>dfs.blocksize</name>
  <value>134217728</value>  <!-- 134217728 (128MB); will be changed to 67108864 (64MB) -->
  <source>hdfs-default.xml</source>
</property>
# The default comes from hdfs-default.xml; the full list of default settings is at
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
# It can also be checked from the command line
[hadoop@hnamenode ~]$ hdfs getconf -confKey dfs.blocksize
134217728
# Check the block size of a single file
[hadoop@hnamenode ~]$ hdfs dfs -stat %o /home/hadoop/test_map.R
134217728
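# The stat format accepts more fields than %o; for example the file name (%n),
# block size (%o) and replication factor (%r) can be combined in one call:
[hadoop@hnamenode ~]$ hdfs dfs -stat "%n %o %r" /home/hadoop/test_map.R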
# Use fsck to view the block distribution of a file
[hadoop@hnamenode ~]$ hdfs fsck /home/hadoop/test_map.R -blocks
Connecting to namenode via http://hnamenode:50070/fsck?ugi=hadoop&blocks=1&path=%2Fhome%2Fhadoop%2Ftest_map.R
FSCK started by hadoop (auth:SIMPLE) from /192.168.1.100 for path /home/hadoop/test_map.R at Sat Sep 26 14:59:54 CST 2015
.Status: HEALTHY
Total size: 81 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 81 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 17
Number of racks: 1
FSCK ended at Sat Sep 26 14:59:54 CST 2015 in 0 milliseconds
The filesystem under path '/home/hadoop/test_map.R' is HEALTHY
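# Adding -files and -locations to fsck also prints which datanodes hold each
# block replica (output omitted here):
[hadoop@hnamenode ~]$ hdfs fsck /home/hadoop/test_map.R -files -blocks -locations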
# First, stop HDFS and YARN
# stop-all.sh
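# (on Hadoop 2.x, stop-all.sh is deprecated; running stop-dfs.sh followed by
#  stop-yarn.sh accomplishes the same thing)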
# Edit hdfs-site.xml and add the following to change dfs.blocksize to 64MB (the system default is 128MB).
<!-- by mtchang -->
<property>
  <name>dfs.blocksize</name>
  <value>67108864</value>
</property>
<!-- change block size to 64MB -->
# Restart after the change
# start-all.sh
# Check the blocksize after the change
[hadoop@hnamenode hadoop]$ hdfs getconf -confKey dfs.blocksize
67108864
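# Since dfs.blocksize is a client-side setting read at write time, it can also
# be overridden per command with a generic -D option instead of editing
# hdfs-site.xml (a sketch; the file name here is only a placeholder):
[hadoop@hnamenode hadoop]$ hdfs dfs -D dfs.blocksize=67108864 -put some_file.RData /home/hadoop/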
# Push a large file, about 293MB, onto HDFS and take a look.
[hadoop@hnamenode data]$ hdfs dfs -put big_number_1G.RData /home/hadoop/
# The new file was written with the 64MB block size
[hadoop@hnamenode data]$ hdfs dfs -stat %o /home/hadoop/big_number_1G.RData
67108864
# But files that already existed keep their original block size
[hadoop@hnamenode data]$ hdfs dfs -stat %o /public/data/big_num_400t1t.RData
134217728
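# Block size is fixed when a file is written, so an existing file only picks up
# the new size if it is rewritten, e.g. copied to a new path (a sketch; the
# destination name is only an illustration):
[hadoop@hnamenode data]$ hdfs dfs -cp /public/data/big_num_400t1t.RData /public/data/big_num_400t1t_64m.RData
[hadoop@hnamenode data]$ hdfs dfs -stat %o /public/data/big_num_400t1t_64m.RData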
# Inspect the block layout of the new file
[hadoop@hnamenode data]$ hdfs fsck /home/hadoop/big_number_1G.RData -blocks
Connecting to namenode via http://hnamenode:50070/fsck?ugi=hadoop&blocks=1&path=%2Fhome%2Fhadoop%2Fbig_number_1G.RData
FSCK started by hadoop (auth:SIMPLE) from /192.168.1.100 for path /home/hadoop/big_number_1G.RData at Sat Sep 26 15:14:07 CST 2015
.Status: HEALTHY
Total size: 307474871 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 5 (avg. block size 61494974 B)
Minimally replicated blocks: 5 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 16
Number of racks: 1
FSCK ended at Sat Sep 26 15:14:07 CST 2015 in 1 milliseconds
The filesystem under path '/home/hadoop/big_number_1G.RData' is HEALTHY
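# Sanity check on the block count: ceil(307474871 / 67108864) = ceil(4.58) = 5,
# which matches the 5 blocks reported above:
[hadoop@hnamenode data]$ awk 'BEGIN { print int((307474871 + 67108864 - 1) / 67108864) }'
5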
Commands:
http://hadoop.apache.org/docs/r1.2.1/commands_manual.html#Generic+Options
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
Explanations:
http://hadoop.apache.org/docs/r1.2.1/hdfs_user_guide.html#Rebalancer
https://www.quora.com/How-do-I-check-HDFS-blocksize-default-custom