
PDFTextStream Features

PDFTextStream was built from the ground up to meet the most stringent requirements for extracting text and metadata from PDF documents. Its comprehensive API includes the following features:

  • Extensive support for the PDF file format specification and all known variants

  • Full Unicode-capable text extraction facilities, including support for extracting Chinese, Japanese, and Korean text, in both horizontal and vertical writing modes

  • Full support for updating interactive AcroForms (including text, checkbox, radio button, and choice fields)

  • Comprehensive PDF document metadata access

    • Simple key/value attributes

    • Adobe XMP - XML metadata access

  • Page-level object model via com.snowtide.pdf.Page, providing page-specific text extraction and page metrics (height, width, rotation angle, etc.)

  • AcroForm (interactive form) data extraction

  • PDF bookmark (document outline) access

  • PDF annotation access (including Link (web URL) annotations)

  • Seamless Lucene integration

  • EncryptionInfo API: provides access to PDF document encryption parameters

  • Text-piping API for super-fast text extraction, with hooks for customizing how extracted PDF text is formatted (such as when the visual layout of each page needs to be maintained)

  • Built-in selective regional text extraction, ideal for extracting data from fixed-format forms

  • Optional in-memory operation

  • Built-in PDF merge utility

  • PDF to HTML exporter

  • PDFTextStream subclasses java.io.Reader, which ensures a simple, familiar interface and straightforward integration with existing components that expect a java.io.Reader instance

  • Flexible logging toolkit hooks

    • Built-in support for logging to standard output, Log4J, and java.util.logging

    • Ability to plug in custom logging implementations
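
To illustrate the java.io.Reader point in the list above, here is a minimal sketch of a component written purely against java.io.Reader. The class ReaderWordCount and its countWords method are hypothetical names invented for this example, and a StringReader stands in for extracted PDF text so the sketch runs without the library. Since PDFTextStream is described above as a Reader subclass, an instance of it could presumably be passed to the same method unchanged; how that instance is constructed is not covered on this page and is not shown here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReaderWordCount {

    /**
     * Counts whitespace-separated words from any Reader.
     * The method knows nothing about PDF; it only relies on java.io.Reader.
     * Note that it closes the Reader it is given once reading is done.
     */
    public static int countWords(Reader source) throws IOException {
        int count = 0;
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    count += trimmed.split("\\s+").length;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: this StringReader is a stand-in for text extracted from a PDF.
        // With the library on the classpath, a PDFTextStream instance would take
        // its place, because it is also a java.io.Reader.
        Reader source = new StringReader("Invoice 1234\nTotal due: 56.00 USD\n");
        System.out.println(countWords(source)); // prints 6
    }
}
```

The design point is that the consuming code depends only on the standard Reader contract, so swapping the text source requires no change to countWords.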
