

1. Overview
In this article, we discuss I/O operations on HDFS from a Java program. Hadoop provides two main classes for this: FSDataInputStream for reading a file from HDFS and FSDataOutputStream for writing a file to HDFS.
2. Development Environment
Hadoop: 3.1.1
Java: Oracle JDK 1.8
IDE: IntelliJ IDEA 2018.3
3. Initialize Configuration
The first step in communicating with HDFS is to initialize the Configuration class and set the fs.defaultFS property to the NameNode address, as in the snippet below.
Configuration configuration = new Configuration();
configuration.set("fs.defaultFS", "hdfs://localhost:9000");
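The fs.defaultFS value is a URI: the scheme (hdfs) selects the file system implementation, and the authority names the NameNode host and RPC port. As a minimal sketch that uses only java.net.URI and no Hadoop classes (the class name DefaultFsUri is made up for illustration), the parts can be inspected like this:

```java
import java.net.URI;

class DefaultFsUri {
    // Splits an fs.defaultFS value into scheme, host and port.
    static String describe(String defaultFs) {
        URI uri = URI.create(defaultFs);
        return "scheme=" + uri.getScheme() + ", host=" + uri.getHost() + ", port=" + uri.getPort();
    }

    public static void main(String[] args) {
        // Same value as in the snippet above; adjust host and port for your cluster.
        System.out.println(describe("hdfs://localhost:9000"));
        // prints "scheme=hdfs, host=localhost, port=9000"
    }
}
```

If the host or port here does not match the NameNode's own configuration, the client calls in the following sections will typically fail with a connection error.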
4. Create Directory in HDFS
The Hadoop FileSystem class provides the administrative operations, such as creating a file or directory and deleting a file. Its mkdirs method creates a directory in HDFS.
4.1 Example
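import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public static void createDirectory() throws IOException {
    Configuration configuration = new Configuration();
    configuration.set("fs.defaultFS", "hdfs://localhost:9000");
    FileSystem fileSystem = FileSystem.get(configuration);
    // Absolute path; a relative path would be resolved against the user's HDFS home directory
    String directoryName = "/user/javadeveloperzone/javareadwriteexample";
    Path path = new Path(directoryName);
    // Creates the directory along with any missing parent directories
    fileSystem.mkdirs(path);
}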
4.2 Output
Open the HDFS web UI. If everything ran fine, you will see the directory javareadwriteexample under the /user/javadeveloperzone path.
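Where the directory ends up depends on whether the path is absolute. HDFS resolves a relative path against the client's working directory, which defaults to the home directory /user/<username>. The following minimal sketch only mimics that resolution with java.net.URI; it is not the Hadoop API, and the user name alice and the class RelativePathDemo are hypothetical:

```java
import java.net.URI;

class RelativePathDemo {
    // Mimics resolving a path against an HDFS home directory; not the Hadoop API.
    static String resolve(String homeDir, String path) {
        // The trailing slash makes the home directory the base for relative paths.
        return URI.create(homeDir + "/").resolve(path).getPath();
    }

    public static void main(String[] args) {
        String home = "/user/alice"; // hypothetical user
        // Relative path: lands below the home directory.
        System.out.println(resolve(home, "javadeveloperzone/javareadwriteexample"));
        // Absolute path: the home directory is ignored.
        System.out.println(resolve(home, "/user/javadeveloperzone/javareadwriteexample"));
    }
}
```

Using an absolute path keeps the location independent of the user who runs the program.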
5. Write File to HDFS
The FSDataOutputStream class is used to write data to an HDFS file. It also provides methods such as writeUTF, writeInt and writeChar. Here we wrap the FSDataOutputStream in a BufferedWriter.
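import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public static void writeFileToHDFS() throws IOException {
    Configuration configuration = new Configuration();
    configuration.set("fs.defaultFS", "hdfs://localhost:9000");
    FileSystem fileSystem = FileSystem.get(configuration);
    // Create a path
    String fileName = "read_write_hdfs_example.txt";
    Path hdfsWritePath = new Path("/user/javadeveloperzone/javareadwriteexample/" + fileName);
    // Create the file, overwriting it if it already exists
    FSDataOutputStream fsDataOutputStream = fileSystem.create(hdfsWritePath, true);
    BufferedWriter bufferedWriter = new BufferedWriter(
            new OutputStreamWriter(fsDataOutputStream, StandardCharsets.UTF_8));
    bufferedWriter.write("Java API to write data in HDFS");
    bufferedWriter.newLine();
    // Closing the writer also closes the underlying HDFS stream
    bufferedWriter.close();
    fileSystem.close();
}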
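FSDataOutputStream extends java.io.DataOutputStream, so writeUTF, writeInt and writeChar produce the standard Java binary encoding rather than plain text, and such data has to be read back with the matching readUTF, readInt and readChar calls in the same order. The sketch below illustrates this with an in-memory stream standing in for an HDFS file; the class DataStreamRoundTrip and its sample values are assumptions for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class DataStreamRoundTrip {
    // Writes three typed values in binary form and returns the raw bytes.
    static byte[] write(String text, int number, char letter) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(text);    // 2-byte length prefix + modified UTF-8 bytes
            out.writeInt(number);  // 4 bytes
            out.writeChar(letter); // 2 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the values back in the same order they were written.
    static String read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readUTF() + "|" + in.readInt() + "|" + in.readChar();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = write("HDFS", 42, 'x');
        System.out.println(data.length + " bytes: " + read(data)); // prints "12 bytes: HDFS|42|x"
    }
}
```

For human-readable text files, wrapping the stream in a BufferedWriter as in the example above is usually the simpler choice.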
6. Append Data to File
The append method of the FileSystem class is used to append data to an existing file.
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public static void appendToHDFSFile() throws IOException {
    Configuration configuration = new Configuration();
    configuration.set("fs.defaultFS", "hdfs://localhost:9000");
    FileSystem fileSystem = FileSystem.get(configuration);
    // Create a path
    String fileName = "read_write_hdfs_example.txt";
    Path hdfsWritePath = new Path("/user/javadeveloperzone/javareadwriteexample/" + fileName);
    // Open the existing file for appending
    FSDataOutputStream fsDataOutputStream = fileSystem.append(hdfsWritePath);
    BufferedWriter bufferedWriter = new BufferedWriter(
            new OutputStreamWriter(fsDataOutputStream, StandardCharsets.UTF_8));
    bufferedWriter.write("Java API to append data in HDFS file");
    bufferedWriter.newLine();
    bufferedWriter.close();
    fileSystem.close();
}
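Both the write and the append example close the writer only on the success path, so the stream stays open if write() throws. A minimal sketch of the same writing logic with try-with-resources follows. It accepts any OutputStream, so the stream returned by fileSystem.create(...) or fileSystem.append(...) could be passed in; the helper class LineWriter is hypothetical, and an in-memory stream stands in for HDFS here.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class LineWriter {
    // Writes one line as UTF-8 and closes the stream, even if write() throws.
    static void writeLine(OutputStream target, String line) {
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(target, StandardCharsets.UTF_8))) {
            writer.write(line);
            writer.newLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the stream returned by fileSystem.create(...) or fileSystem.append(...).
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        writeLine(target, "Java API to append data in HDFS file");
        System.out.print(new String(target.toByteArray(), StandardCharsets.UTF_8));
    }
}
```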
7. Read File From HDFS
The FSDataInputStream class provides the means to read a file from HDFS.
7.1 Example
import java.io.IOException;
import org.apache.commons.io.IOUtils; // Apache Commons IO, not org.apache.hadoop.io.IOUtils
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public static void readFileFromHDFS() throws IOException {
    Configuration configuration = new Configuration();
    configuration.set("fs.defaultFS", "hdfs://localhost:9000");
    FileSystem fileSystem = FileSystem.get(configuration);
    // Create a path
    String fileName = "read_write_hdfs_example.txt";
    Path hdfsReadPath = new Path("/user/javadeveloperzone/javareadwriteexample/" + fileName);
    // Init input stream
    FSDataInputStream inputStream = fileSystem.open(hdfsReadPath);
    // Classical input stream usage: read the whole file into one string
    String out = IOUtils.toString(inputStream, "UTF-8");
    System.out.println(out);
    /* Alternative: read line by line
    BufferedReader bufferedReader = new BufferedReader(
            new InputStreamReader(inputStream, StandardCharsets.UTF_8));
    String line = null;
    while ((line = bufferedReader.readLine()) != null) {
        System.out.println(line);
    }*/
    inputStream.close();
    fileSystem.close();
}
7.2 Output
Java API to write data in HDFS
Java API to append data in HDFS file
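IOUtils.toString loads the whole file into memory as one string, which is fine for small files. For larger files, reading line by line, as in the commented-out alternative of the example in 7.1, is usually the better fit. Below is a minimal sketch of that loop with try-with-resources. It accepts any InputStream, so the stream returned by fileSystem.open(...) could be passed in; the helper class LineReader is hypothetical, and an in-memory stream stands in for the HDFS file.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LineReader {
    // Reads all lines as UTF-8 and closes the stream when done.
    static List<String> readLines(InputStream source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(source, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for the stream returned by fileSystem.open(...).
        byte[] content = "Java API to write data in HDFS\nJava API to append data in HDFS file\n"
                .getBytes(StandardCharsets.UTF_8);
        for (String line : readLines(new ByteArrayInputStream(content))) {
            System.out.println(line);
        }
    }
}
```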
8. Conclusion
In this article, we discussed how to create a directory in HDFS, write a file to HDFS, append to an existing file, and read a file back, each with an example. FSDataOutputStream and FSDataInputStream, obtained through the FileSystem class, provide the methods needed for these operations.
9. Source Code
You can also check our Git repository for Java Read & Write files in HDFS Example and other useful examples.
1 comment
I still get “java.lang.ClassCastException: org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos” with Hadoop 3.2.1.
Please let me know which JARs you used in the mkdir sample.