1. Overview

In this article, we discuss I/O operations with HDFS from a Java program. Hadoop provides two main classes for this: FSDataInputStream for reading a file from HDFS and FSDataOutputStream for writing a file to HDFS.

2. Development Environment

Hadoop: 3.1.1

Java: Oracle JDK 1.8

IDE: IntelliJ IDEA 2018.3

3. Initialize Configuration

The first step in communicating with HDFS is to initialize the Configuration class and set the fs.defaultFS property to the NameNode URI. Refer to the code snippet below.

Configuration configuration = new Configuration();
configuration.set("fs.defaultFS", "hdfs://localhost:9000");

4. Create Directory in HDFS

The Hadoop FileSystem class provides the file-administration functionality, such as creating a file or directory and deleting a file. Its mkdirs method creates a directory in HDFS, including any missing parent directories.

4.1 Example

public static void createDirectory() throws IOException {
        Configuration configuration = new Configuration();
        configuration.set("fs.defaultFS", "hdfs://localhost:9000");
        FileSystem fileSystem = FileSystem.get(configuration);
        // Absolute path: a relative path would resolve against the user's HDFS home directory
        String directoryName = "/user/javadeveloperzone/javareadwriteexample";
        Path path = new Path(directoryName);
        fileSystem.mkdirs(path);
    }

4.2 Output

Open the HDFS web UI. If everything is running fine, you will see a directory javareadwriteexample under the /user/javadeveloperzone path.


5. Write File to HDFS

The FSDataOutputStream class is used to write data to an HDFS file. It also provides methods such as writeUTF, writeInt, and writeChar, inherited from java.io.DataOutputStream. Here we have wrapped the FSDataOutputStream in a BufferedWriter.

public static void writeFileToHDFS() throws IOException {
        Configuration configuration = new Configuration();
        configuration.set("fs.defaultFS", "hdfs://localhost:9000");
        FileSystem fileSystem = FileSystem.get(configuration);
        //Create a path
        String fileName = "read_write_hdfs_example.txt";
        Path hdfsWritePath = new Path("/user/javadeveloperzone/javareadwriteexample/" + fileName);
        //Create the file; the second argument (true) overwrites an existing file
        FSDataOutputStream fsDataOutputStream = fileSystem.create(hdfsWritePath,true);

        BufferedWriter bufferedWriter = new BufferedWriter(new OutputStreamWriter(fsDataOutputStream,StandardCharsets.UTF_8));
        bufferedWriter.write("Java API to write data in HDFS");
        bufferedWriter.newLine();
        bufferedWriter.close();
        fileSystem.close();
    }
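
The typed methods mentioned above (writeInt, writeChar, writeUTF) come from java.io.DataOutputStream, which FSDataOutputStream extends. The following is a minimal sketch of how they behave, assuming a ByteArrayOutputStream as a stand-in for the HDFS stream so it runs without a cluster. The class and method names (TypedWriteSketch, encode, decode) are hypothetical and not part of the original example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class TypedWriteSketch {

    // Writes an int, a char and a string using the typed DataOutputStream methods.
    static byte[] encode(int id, char flag, String name) throws IOException {
        // Assumption: an in-memory buffer stands in for the stream returned by fileSystem.create(...)
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(id);    // 4 bytes, big-endian
            out.writeChar(flag); // 2 bytes
            out.writeUTF(name);  // 2-byte length prefix followed by modified UTF-8 bytes
        }
        return buffer.toByteArray();
    }

    // Reads the values back in the same order in which they were written.
    static String decode(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = in.readInt();
            char flag = in.readChar();
            String name = in.readUTF();
            return id + ":" + flag + ":" + name;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(decode(encode(42, 'A', "hdfs")));
    }
}
```

Data written this way is binary, so it has to be read back with the matching readInt, readChar, and readUTF calls rather than as plain text. For text files, the BufferedWriter approach shown in this section is usually the simpler choice.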

6. Append Data to File

The append method of the FileSystem class opens an existing file and returns an output stream for adding data to its end.

public static void appendToHDFSFile() throws IOException {
        Configuration configuration = new Configuration();
        configuration.set("fs.defaultFS", "hdfs://localhost:9000");
        FileSystem fileSystem = FileSystem.get(configuration);
        //Create a path
        String fileName = "read_write_hdfs_example.txt";
        Path hdfsWritePath = new Path("/user/javadeveloperzone/javareadwriteexample/" + fileName);
        FSDataOutputStream fsDataOutputStream = fileSystem.append(hdfsWritePath);

        BufferedWriter bufferedWriter = new BufferedWriter(new OutputStreamWriter(fsDataOutputStream,StandardCharsets.UTF_8));
        bufferedWriter.write("Java API to append data in HDFS file");
        bufferedWriter.newLine();
        bufferedWriter.close();
        fileSystem.close();
    }
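
The write and append examples differ only in how the output stream is obtained, so the writing logic can be shared. The following is a rough sketch of such a helper, assuming a ByteArrayOutputStream as a stand-in for the streams returned by create and append. The names LineWriterSketch, writeLine, and demo are hypothetical. It also uses try-with-resources so the writer is closed even if a write fails.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

class LineWriterSketch {

    // Writes one UTF-8 encoded line to the given stream, then closes the stream.
    static void writeLine(OutputStream out, String line) throws IOException {
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            writer.write(line);
            writer.newLine(); // platform line separator
        }
    }

    // Simulates "create, then append" against an in-memory buffer.
    static String demo() throws IOException {
        // Assumption: stands in for the HDFS file. Closing a ByteArrayOutputStream
        // has no effect, so the same buffer can receive the second write.
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        writeLine(file, "Java API to write data in HDFS");
        writeLine(file, "Java API to append data in HDFS file");
        return new String(file.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.print(demo());
    }
}
```

With a real cluster, the first call would receive the stream from fileSystem.create(path, true) and the second the stream from fileSystem.append(path).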

7. Read File From HDFS

The FSDataInputStream class is used to read a file from HDFS. The open method of FileSystem returns one for a given path.

7.1 Example

public static void readFileFromHDFS() throws IOException {
        Configuration configuration = new Configuration();
        configuration.set("fs.defaultFS", "hdfs://localhost:9000");
        FileSystem fileSystem = FileSystem.get(configuration);
        //Create a path
        String fileName = "read_write_hdfs_example.txt";
        Path hdfsReadPath = new Path("/user/javadeveloperzone/javareadwriteexample/" + fileName);
        //Init input stream
        FSDataInputStream inputStream = fileSystem.open(hdfsReadPath);
        //Read the whole stream into a String (IOUtils here is org.apache.commons.io.IOUtils)
        String out= IOUtils.toString(inputStream, "UTF-8");
        System.out.println(out);

        /*BufferedReader bufferedReader = new BufferedReader(
                new InputStreamReader(inputStream, StandardCharsets.UTF_8));

        String line = null;
        while ((line=bufferedReader.readLine())!=null){
            System.out.println(line);
        }*/

        inputStream.close();
        fileSystem.close();
    }
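
The commented-out BufferedReader variant in the example is useful when a file is too large to load into a single String. A minimal sketch of that line-by-line approach follows, assuming a ByteArrayInputStream as a stand-in for the stream returned by fileSystem.open. The names LineReaderSketch and readLines are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LineReaderSketch {

    // Reads the stream line by line as UTF-8 and closes it when done.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in-memory bytes stand in for the HDFS file content
        byte[] content = "Java API to write data in HDFS\nJava API to append data in HDFS file\n"
                .getBytes(StandardCharsets.UTF_8);
        for (String line : readLines(new ByteArrayInputStream(content))) {
            System.out.println(line);
        }
    }
}
```

Collecting the lines into a list keeps the sketch easy to check. For very large files, each line could be processed inside the loop instead of being stored.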

7.2 Output

Java API to write data in HDFS
Java API to append data in HDFS file

8. Conclusion

In this article, we have discussed how to create a directory in HDFS, write a file to HDFS, append to an existing file, and read a file from HDFS, with an example of each. FSDataInputStream and FSDataOutputStream provide the methods needed for reading and writing.

9. References

10. Source Code

Read Write HDFS Example

You can also check our Git repository for Java Read & Write files in HDFS Example and other useful examples.
