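The previous post covered SequenceFile; this one introduces MapFile, which can be described as a sorted SequenceFile with an index. A MapFile consists of two parts, data and index. The index file stores the index: when records are accessed through MapFile, the index file is loaded into memory and used to quickly locate the position of the requested record, which makes lookups more efficient.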
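To make the idea of "sorted data plus an in-memory index" more concrete, here is a minimal sketch in plain Java. It does not use Hadoop at all: the class name SparseIndexSketch, the use of String keys and values, and the interval parameter are all assumptions made for illustration, and the real MapFile stores both parts as files rather than in lists. The sketch only shows the general technique: keep every N-th key in memory, jump to the nearest indexed key at or before the target, then scan forward a short distance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SparseIndexSketch {
    // Sorted "data" part: parallel lists of keys and values, in ascending key order.
    private final List<String> keys = new ArrayList<>();
    private final List<String> values = new ArrayList<>();
    // In-memory "index" part: every interval-th key mapped to its position in the data.
    private final TreeMap<String, Integer> index = new TreeMap<>();
    private final int interval;

    public SparseIndexSketch(int interval) {
        this.interval = interval;
    }

    // Append a record; keys must not decrease, otherwise the index would be useless.
    public void append(String key, String value) {
        if (!keys.isEmpty() && keys.get(keys.size() - 1).compareTo(key) > 0) {
            throw new IllegalArgumentException("key out of order: " + key);
        }
        if (keys.size() % interval == 0) {
            index.put(key, keys.size());
        }
        keys.add(key);
        values.add(value);
    }

    // Find the closest indexed key at or before the target, then scan forward from there.
    public String get(String key) {
        Map.Entry<String, Integer> start = index.floorEntry(key);
        if (start == null) {
            return null; // the target sorts before the first record
        }
        for (int i = start.getValue(); i < keys.size(); i++) {
            int cmp = keys.get(i).compareTo(key);
            if (cmp == 0) {
                return values.get(i);
            }
            if (cmp > 0) {
                break; // passed the slot where the key would have been
            }
        }
        return null;
    }

    public static void main(String[] args) {
        SparseIndexSketch file = new SparseIndexSketch(2);
        file.append("a.pdf", "A");
        file.append("b.pdf", "B");
        file.append("c.pdf", "C");
        System.out.println(file.get("c.pdf")); // prints C
        System.out.println(file.get("x.pdf")); // prints null
    }
}
```

Because only every N-th key is held in memory, the index stays small even when the data part is large, at the cost of a short forward scan on each lookup.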
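1. The API is essentially the same as SequenceFile's, except that the MapFile class is used in place of the SequenceFile class. Records can still be read sequentially in the same way, but MapFile can also use the index to fetch the record for a given key directly, without scanning through the file.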
2. Writing the code against the API
(1) First, as before, write a writeToMap method
public void writeToMap(String srcPath, MapFile.Writer writer, Text writableKey, BytesWritable writableValue) {
    try {
        // Use the file name (without its directory) as the record key.
        writableKey.set(new File(srcPath).getName());
        // Read the whole file in one call so that each file becomes exactly one record.
        byte[] data = java.nio.file.Files.readAllBytes(java.nio.file.Paths.get(srcPath));
        writableValue.set(data, 0, data.length);
        // Append the record to the end of the MapFile.Writer.
        writer.append(writableKey, writableValue);
    } catch (IOException e) {
        e.printStackTrace();
    }
}
(2) Write uploadToMap, which packs all files under the given directory into one MapFile and uploads it to HDFS
public void uploadToMap(String srcDir, String desc) {
    MapFile.Writer writer = null;
    try {
        FileSystem fileSystem = FileSystem.get(conf);
        Text writableKey = new Text();
        BytesWritable writableValue = new BytesWritable();
        writer = new MapFile.Writer(conf, fileSystem, desc, writableKey.getClass(), writableValue.getClass());
        File folder = new File(srcDir);
        String[] list = folder.list();
        if (list == null) {
            throw new IOException("Not a readable directory: " + srcDir);
        }
        // MapFile.Writer rejects keys that arrive out of order, and File.list() makes no
        // ordering guarantee, so sort the file names before appending them.
        java.util.Arrays.sort(list);
        for (int i = 0; i < list.length; i++) {
            String filePath = new File(folder, list[i]).getPath();
            writeToMap(filePath, writer, writableKey, writableValue);
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        IOUtils.closeStream(writer);
    }
}
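One thing to watch when filling a MapFile: the writer expects keys in ascending order and throws an IOException for a key that sorts before the previous one. Text keys are compared by their encoded bytes rather than by String.compareTo, so the following is a minimal sketch of sorting file names in UTF-8 byte order before appending them. The class name KeyOrderSketch and the helper sortedKeys are hypothetical names used only for illustration, and the sketch deliberately avoids any Hadoop dependency.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;

public class KeyOrderSketch {
    // Compare two strings by their UTF-8 bytes, treating each byte as unsigned.
    static final Comparator<String> UTF8_BYTE_ORDER = new Comparator<String>() {
        @Override
        public int compare(String a, String b) {
            byte[] x = a.getBytes(StandardCharsets.UTF_8);
            byte[] y = b.getBytes(StandardCharsets.UTF_8);
            int n = Math.min(x.length, y.length);
            for (int i = 0; i < n; i++) {
                int diff = (x[i] & 0xff) - (y[i] & 0xff);
                if (diff != 0) {
                    return diff;
                }
            }
            // All shared bytes are equal, so the shorter string sorts first.
            return x.length - y.length;
        }
    };

    // Return a sorted copy of the file names, ready to be appended one by one.
    static String[] sortedKeys(String[] fileNames) {
        String[] copy = fileNames.clone();
        Arrays.sort(copy, UTF8_BYTE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        String[] names = {"b.txt", "7.pdf", "a.txt", "B.txt"};
        System.out.println(Arrays.toString(sortedKeys(names)));
        // prints [7.pdf, B.txt, a.txt, b.txt]
    }
}
```

For plain ASCII file names this gives the same result as Arrays.sort on the strings; the explicit byte comparison only matters for unusual characters.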
(3) Write downloadFromMap to download a file
public void downloadFromMap(String srcDir, String desc, String fileName) {
    MapFile.Reader reader = null;
    OutputStream out = null;
    try {
        Text writableKey = new Text(fileName);
        BytesWritable writableValue = new BytesWritable();
        reader = new MapFile.Reader(new Path(srcDir), conf);
        // reader.get uses the index to jump straight to the key; it returns null if the key is absent.
        if (reader.get(writableKey, writableValue) == null) {
            System.err.println("Key not found in MapFile: " + fileName);
            return;
        }
        out = new BufferedOutputStream(new FileOutputStream(desc));
        // The backing array may be longer than the data, so write only getLength() bytes.
        out.write(writableValue.getBytes(), 0, writableValue.getLength());
        out.flush();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        IOUtils.closeStream(out);
        IOUtils.closeStream(reader);
    }
}
(4) Finally, call the upload and download methods
public static void main(String[] args) {
    final String HDFS_PATH = "hdfs://192.168.121.130:9000";
    MapFileOperation mapFile = new MapFileOperation(HDFS_PATH);
    mapFile.uploadToMap("C:\\Users\\will\\Downloads\\myTestFiles", "/testFile/test.map");
    mapFile.downloadFromMap("/testFile/test.map", "C:\\Users\\will\\Downloads\\7.pdf", "7.pdf");
}