1. Suppose there are three machines, with the following entries in /etc/hosts (edit with vi /etc/hosts):
172.17.0.2 namenode
172.17.0.3 secondarynamenode
172.17.0.4 slave1
2. Suppose a 186 MB file has been uploaded to the /tmp directory in HDFS.
3. List it with bin/hadoop fs -ls /tmp:
root@namenode:/usr/local/hadoop/tmp/hdfs/namenode/current# hadoop fs -ls /tmp
Found 1 items
-rw-r--r-- 2 root supergroup 195257604 2014-12-25 01:31 /tmp/hadoop-2.6.0.tar.gz
4. Use the fsck command, which checks the health of the entire filesystem; it can also report how many blocks a file was split into and which datanodes hold them:
root@namenode:/usr/local/hadoop/tmp/hdfs/namenode/current# hadoop fsck /tmp/hadoop-2.6.0.tar.gz -files -blocks -locations
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Connecting to namenode via http://namenode:50070
FSCK started by root (auth:SIMPLE) from /172.17.0.2 for path /tmp/hadoop-2.6.0.tar.gz at Thu Dec 25 02:50:41 UTC 2014
/tmp/hadoop-2.6.0.tar.gz 195257604 bytes, 2 block(s): OK
0. BP-1142070096-172.17.0.2-1419470024422:blk_1073741825_1001 len=134217728 repl=2 [172.17.0.4:50010, 172.17.0.3:50010]
1. BP-1142070096-172.17.0.2-1419470024422:blk_1073741826_1002 len=61039876 repl=2 [172.17.0.3:50010, 172.17.0.4:50010]
Status: HEALTHY
Total size: 195257604 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 2 (avg. block size 97628802 B)
Minimally replicated blocks: 2 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 2
Number of racks: 1
FSCK ended at Thu Dec 25 02:50:41 UTC 2014 in 2 milliseconds
The filesystem under path '/tmp/hadoop-2.6.0.tar.gz' is HEALTHY
From this we can see that the file was split into two blocks:
blk_1073741825_1001, 134217728 bytes (128 MB), located on 172.17.0.4 and 172.17.0.3
blk_1073741826_1002, 61039876 bytes (about 58 MB), located on 172.17.0.3 and 172.17.0.4
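The two block sizes follow directly from HDFS's fixed block size, assuming the Hadoop 2.x default dfs.blocksize of 128 MB. A quick sketch of the arithmetic:

```python
BLOCK_SIZE = 128 * 1024 * 1024  # dfs.blocksize: 128 MB, the Hadoop 2.x default
FILE_SIZE = 195257604           # size of hadoop-2.6.0.tar.gz, as reported by fs -ls

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the block sizes HDFS uses when splitting a file of the given size."""
    sizes = []
    remaining = file_size
    while remaining > 0:
        sizes.append(min(block_size, remaining))
        remaining -= block_size
    return sizes

print(split_into_blocks(FILE_SIZE))  # → [134217728, 61039876]
```

The last block simply holds whatever remains after the full 128 MB blocks, which is why it is only about 58 MB here.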
5. We want to look at where the file lives on slave1 (172.17.0.4), so first log in to slave1 over ssh:
root@namenode:/usr/local/hadoop/tmp/hdfs/namenode/current# ssh root@slave1
Welcome to Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-32-generic x86_64)
* Documentation: https://help.ubuntu.com/
Last login: Thu Dec 25 02:45:28 2014 from namenode
root@slave1:~#
6. When configuring the datanode, we set its storage path to file:/usr/local/hadoop/tmp/hdfs/datanode, so change into that directory first:
root@slave1:~# cd /usr/local/hadoop/tmp/hdfs/datanode/
root@slave1:/usr/local/hadoop/tmp/hdfs/datanode#
7. Next, locate the file named "blk_1073741825":
root@slave1:/usr/local/hadoop/tmp/hdfs/datanode# find . -name "blk_1073741825*"
./current/BP-1142070096-172.17.0.2-1419470024422/current/finalized/subdir0/subdir0/blk_1073741825
./current/BP-1142070096-172.17.0.2-1419470024422/current/finalized/subdir0/subdir0/blk_1073741825_1001.meta
The results above show two files: one is the .meta file, and the other is the block itself, i.e. the first piece of the split file.
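As an aside, the .meta file holds a short header plus one CRC checksum per chunk of block data (4 bytes per checksum, over 512-byte chunks under the default dfs.bytes-per-checksum). A rough sketch of the expected checksum payload for the 128 MB block, with those defaults taken as assumptions:

```python
import math

BYTES_PER_CHECKSUM = 512  # dfs.bytes-per-checksum default (assumption)
CHECKSUM_SIZE = 4         # bytes per CRC32 checksum

def checksum_bytes(block_len, chunk=BYTES_PER_CHECKSUM):
    """Approximate checksum payload stored in a block's .meta file (header excluded)."""
    return math.ceil(block_len / chunk) * CHECKSUM_SIZE

print(checksum_bytes(134217728))  # 262144 chunks × 4 bytes = 1048576
```

This is why the .meta file is tiny (about 1 MB here) compared with the block it describes.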
8. We can therefore verify this by checking the file's size with the du command.
Note: du walks the filesystem and scans the actual file data.
-s: report only the total, without listing each directory's usage
-h: display sizes in a human-readable format (MB/GB, ...)
root@slave1:/usr/local/hadoop/tmp/hdfs/datanode# du -sh ./current/BP-1142070096-172.17.0.2-1419470024422/current/finalized/subdir0/subdir0/blk_1073741825
128M ./current/BP-1142070096-172.17.0.2-1419470024422/current/finalized/subdir0/subdir0/blk_1073741825
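The 128M that du reports matches the len=134217728 from the fsck output; converting the byte counts confirms the match:

```python
def to_mib(n_bytes):
    """Convert a byte count to mebibytes (what du -h labels 'M')."""
    return n_bytes / (1024 * 1024)

print(to_mib(134217728))           # → 128.0
print(round(to_mib(61039876), 1))  # → 58.2
```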