How to Read a PDF File in Java


In this example, we show how to read the text content of a PDF file in Java using Apache PDFBox.

Source Code

package com.beginner.examples;

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

public class ReadPDFByPDFBox {

	public static void main(String[] args) {
		System.out.println(getContent("E:\\tmp\\Test.pdf"));
	}
	
	// Extracts all text from the PDF at the given path.
	// Returns an empty string if the file cannot be opened or parsed.
	public static String getContent(String path) {
		String result = "";
		// try-with-resources closes both the input stream and the document,
		// even if text extraction fails (PDDocument.load is the PDFBox 2.x API)
		try (InputStream is = new FileInputStream(path);
				PDDocument document = PDDocument.load(is)) {
			PDFTextStripper stripper = new PDFTextStripper();
			result = stripper.getText(document).trim();
		} catch (FileNotFoundException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}
		// Note: the extracted text is converted to lower case before it is returned
		return result.toLowerCase();
	}

}

Output:

hello world!
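
The getContent method above returns an empty string both when the file is missing and when it is not a valid PDF, so it can be useful to check the input before handing it to PDFBox. The following is a minimal sketch, not part of the original example: the class name PdfHeaderCheck and the method looksLikePdf are hypothetical, and it assumes that a well-formed PDF begins with the bytes "%PDF-" (some tools tolerate leading junk, which this check would reject). It uses only the Java standard library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of PDFBox.
public class PdfHeaderCheck {

    // A well-formed PDF file starts with "%PDF-" followed by a version number.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the path is a readable regular file that begins with "%PDF-". */
    public static boolean looksLikePdf(Path path) {
        if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
            return false;
        }
        try (InputStream in = Files.newInputStream(path)) {
            byte[] head = new byte[PDF_MAGIC.length];
            int read = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    break;
                }
                read += n;
            }
            return read == head.length && Arrays.equals(head, PDF_MAGIC);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Two temporary files stand in for a real PDF and a plain text file
        Path pdfLike = Files.createTempFile("sample", ".pdf");
        Path plain = Files.createTempFile("sample", ".txt");
        try {
            Files.write(pdfLike, "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII));
            Files.write(plain, "hello world!".getBytes(StandardCharsets.US_ASCII));
            System.out.println(looksLikePdf(pdfLike)); // prints true
            System.out.println(looksLikePdf(plain));   // prints false
        } finally {
            Files.deleteIfExists(pdfLike);
            Files.deleteIfExists(plain);
        }
    }
}
```

A caller could run looksLikePdf(Paths.get(path)) before getContent(path) and report a clear error when it returns false. The check only inspects the header, so a file that passes can still fail to parse.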

