intarsys PDF library API

de.intarsys.pdf.parser
Class COSDocumentParser

java.lang.Object
  extended by de.intarsys.pdf.parser.PDFParser
      extended by de.intarsys.pdf.parser.COSDocumentParser

public class COSDocumentParser
extends PDFParser

A parser for PDF data streams.

The parser creates an object representation of the PDF document using COS-level objects.

The parser is a one-pass, read-everything implementation.


Field Summary
 
Fields inherited from class de.intarsys.pdf.parser.PDFParser
C_WARN_ARRAYSIZE, C_WARN_ENDOBJ_MISSING, C_WARN_ENDSTREAMCORRUPT, C_WARN_ENDSTREAMEOL, C_WARN_ILLEGALHEX, C_WARN_NAMETOLONG, C_WARN_SINGLEEOL, C_WARN_SINGLEEOL_OBJ, C_WARN_SINGLESPACE, C_WARN_SINGLESPACE_OBJ, C_WARN_STREAMEOL, C_WARN_STREAMEXTERNAL, C_WARN_STREAMLENGTH, C_WARN_STRINGTOLONG, C_WARN_UNEVENHEX, CHAR_BS, CHAR_CR, CHAR_FF, CHAR_HT, CHAR_LF, TOKEN_endobj, TOKEN_endstream, TOKEN_EOF, TOKEN_false, TOKEN_FDFHEADER, TOKEN_ndstream, TOKEN_null, TOKEN_obj, TOKEN_PDFHEADER, TOKEN_R, TOKEN_s_tream, TOKEN_startxref, TOKEN_stream, TOKEN_trailer, TOKEN_true, TOKEN_xref
 
Constructor Summary
COSDocumentParser(STDocument doc)
           
 
Method Summary
 STDocument getDoc()
           
 boolean isTokenXRefAt(IRandomAccess input, int offset)
           
 COSObject parseIndirectObject(IRandomAccess input, ISystemSecurityHandler securityHandler)
          Reads a PDF-style object from the input. See PDF Reference v1.4, chapter 3.2.9, "Indirect Objects": COSIndirectObject ::= ObjNum GenNum "obj" Object "endobj"
 int parseStartXRef(IRandomAccess input)
          Parses and returns the startxref value.
 COSDictionary parseTrailer(IRandomAccess input)
          Parses the trailer section from the current stream position. See PDF Reference v1.4, chapter 3.4.4, "File Trailer": DocumentTrailer ::= "trailer" COSDict "startxref" COSNumber
 int searchLastStartXRef(IRandomAccess input)
          Searches for the offset of the first trailer in the last 1024 bytes of the document.
 int searchLinearized(IRandomAccess input)
          Deprecated. Do not use this anymore. Returns the offset of the dictionary with linearization parameters, if any; returns -1 otherwise.
 
Methods inherited from class de.intarsys.pdf.parser.PDFParser
getExceptionHandler, handleError, handleWarning, isDelimiter, isDigit, isEOL, isNumberStart, isOctalDigit, isTokenStart, isWhitespace, parseElement, parseHeader, readInteger, readSpaces, readToken, readToken, setExceptionHandler, toCOSObject
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

COSDocumentParser

public COSDocumentParser(STDocument doc)

Method Detail

isTokenXRefAt

public boolean isTokenXRefAt(IRandomAccess input,
                             int offset)
                      throws IOException
Throws:
IOException

parseIndirectObject

public COSObject parseIndirectObject(IRandomAccess input,
                                     ISystemSecurityHandler securityHandler)
                              throws IOException,
                                     COSLoadException
Reads a PDF-style object from the input. See PDF Reference v1.4, chapter 3.2.9, "Indirect Objects".

COSIndirectObject ::= ObjNum GenNum "obj" Object "endobj"

Returns:
The parsed object.
Throws:
IOException
COSLoadException
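
The grammar above can be illustrated with a small standalone sketch. It is not the library's implementation and does not use the intarsys API: the class IndirectObjectSketch and its parse method are hypothetical names, and the body is kept as raw text instead of being turned into COS objects. It also assumes the body does not itself contain the word "endobj" (which a real stream could).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the intarsys API.
class IndirectObjectSketch {
    // ObjNum GenNum "obj" Object "endobj"; DOTALL lets the body span several lines.
    private static final Pattern INDIRECT = Pattern.compile(
            "\\s*(\\d+)\\s+(\\d+)\\s+obj\\b(.*?)\\bendobj\\b", Pattern.DOTALL);

    /** Returns {objNum, genNum, body}, or null if the text does not start with an indirect object. */
    static String[] parse(String text) {
        Matcher m = INDIRECT.matcher(text);
        if (!m.lookingAt()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3).trim() };
    }

    public static void main(String[] args) {
        String[] parts = parse("12 0 obj\n<< /Type /Catalog >>\nendobj\n");
        System.out.println("object number: " + parts[0]);
        System.out.println("generation:    " + parts[1]);
        System.out.println("body:          " + parts[2]);
    }
}
```

A regular expression is enough to show the shape of the production; a real parser has to tokenize the body, since strings and streams may contain arbitrary bytes.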

searchLastStartXRef

public int searchLastStartXRef(IRandomAccess input)
                        throws IOException,
                               COSLoadException
Searches for the offset of the first trailer in the last 1024 bytes of the document. The search runs backwards, starting at the last byte.

Returns:
the offset of the first trailer found
Throws:
IOException
COSLoadException
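
A backward search over the tail of a file might look like the following sketch. It works on a plain byte array instead of an IRandomAccess, and the names TailSearchSketch, WINDOW and searchBackwards are hypothetical. Using "startxref" as the keyword in the example is an assumption based on the method name; the description above speaks of the trailer.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the intarsys API.
class TailSearchSketch {
    // Size of the tail window, taken from the description above.
    static final int WINDOW = 1024;

    /**
     * Scans backwards from the end of data and returns the offset of the first
     * match of keyword that begins within the last WINDOW bytes, or -1 if none.
     */
    static int searchBackwards(byte[] data, byte[] keyword) {
        int lowest = Math.max(0, data.length - WINDOW);
        for (int start = data.length - keyword.length; start >= lowest; start--) {
            if (matchesAt(data, start, keyword)) {
                return start;
            }
        }
        return -1;
    }

    private static boolean matchesAt(byte[] data, int start, byte[] keyword) {
        for (int i = 0; i < keyword.length; i++) {
            if (data[start + i] != keyword[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] keyword = "startxref".getBytes(StandardCharsets.US_ASCII);
        byte[] data = "xref\ntrailer\n<< >>\nstartxref\n0\n%%EOF\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println("offset: " + searchBackwards(data, keyword));
    }
}
```

Because the scan starts at the end, the first match found is the one closest to the end of the file, which is the relevant one when a document has been updated incrementally.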

parseStartXRef

public int parseStartXRef(IRandomAccess input)
                   throws IOException,
                          COSLoadException
Parses and returns the startxref value.

Returns:
the startxref value
Throws:
IOException
COSLoadException
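
As a rough illustration, reading the number that follows the startxref keyword could be sketched as below. This is a standalone example on a String, not the library's code; the class StartXRefSketch, its parse method, and the convention of returning -1 for malformed input are assumptions made for the example.

```java
// Hypothetical helper for illustration only; not part of the intarsys API.
class StartXRefSketch {
    private static final String KEYWORD = "startxref";

    /**
     * Expects the keyword at offset, skips whitespace, and reads the decimal
     * number that follows. Returns -1 if the keyword or the digits are missing.
     * Values beyond the int range are not handled in this sketch.
     */
    static int parse(String text, int offset) {
        if (!text.startsWith(KEYWORD, offset)) {
            return -1;
        }
        int pos = offset + KEYWORD.length();
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        int digitsStart = pos;
        while (pos < text.length() && text.charAt(pos) >= '0' && text.charAt(pos) <= '9') {
            pos++;
        }
        if (pos == digitsStart) {
            return -1;
        }
        return Integer.parseInt(text.substring(digitsStart, pos));
    }

    public static void main(String[] args) {
        System.out.println(parse("startxref\n18799\n%%EOF", 0));
    }
}
```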

searchLinearized

public int searchLinearized(IRandomAccess input)
                     throws IOException,
                            COSLoadException
Deprecated. Do not use this anymore. Returns the offset of the dictionary with linearization parameters, if any; returns -1 otherwise.

Parameters:
input -
Returns:
the offset of the dictionary with linearization parameters, or -1 if there is none
Throws:
IOException
COSLoadException

parseTrailer

public COSDictionary parseTrailer(IRandomAccess input)
                           throws IOException,
                                  COSLoadException
Parses the trailer section from the current stream position. See PDF Reference v1.4, chapter 3.4.4, "File Trailer".

DocumentTrailer ::= "trailer" COSDict "startxref" COSNumber

Returns:
the trailer dictionary
Throws:
IOException
COSLoadException
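
The DocumentTrailer production can be walked through with a small standalone sketch. It is not the library's implementation: TrailerSketch, Trailer and parse are hypothetical names, the dictionary is returned as raw text instead of a COSDictionary, and the bracket matching ignores strings and comments, so it is only suitable for simple inputs.

```java
// Hypothetical helper for illustration only; not part of the intarsys API.
class TrailerSketch {
    /** Raw dictionary text plus the number that follows "startxref". */
    static final class Trailer {
        final String dictionary;
        final int startXRef;

        Trailer(String dictionary, int startXRef) {
            this.dictionary = dictionary;
            this.startXRef = startXRef;
        }
    }

    /** Follows "trailer" COSDict "startxref" COSNumber; returns null on any mismatch. */
    static Trailer parse(String text) {
        int pos = skipSpaces(text, 0);
        if (!text.startsWith("trailer", pos)) {
            return null;
        }
        pos = skipSpaces(text, pos + "trailer".length());
        int dictEnd = endOfDictionary(text, pos);
        if (dictEnd < 0) {
            return null;
        }
        String dictionary = text.substring(pos, dictEnd);
        pos = skipSpaces(text, dictEnd);
        if (!text.startsWith("startxref", pos)) {
            return null;
        }
        pos = skipSpaces(text, pos + "startxref".length());
        int end = pos;
        while (end < text.length() && text.charAt(end) >= '0' && text.charAt(end) <= '9') {
            end++;
        }
        if (end == pos) {
            return null;
        }
        return new Trailer(dictionary, Integer.parseInt(text.substring(pos, end)));
    }

    private static int skipSpaces(String text, int pos) {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        return pos;
    }

    // Returns the index just past the ">>" closing the "<<" at start, or -1.
    private static int endOfDictionary(String text, int start) {
        if (!text.startsWith("<<", start)) {
            return -1;
        }
        int depth = 0;
        int i = start;
        while (i + 1 < text.length()) {
            if (text.startsWith("<<", i)) {
                depth++;
                i += 2;
            } else if (text.startsWith(">>", i)) {
                depth--;
                i += 2;
                if (depth == 0) {
                    return i;
                }
            } else {
                i++;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Trailer t = parse("trailer\n<< /Size 22 /Root 1 0 R >>\nstartxref\n18799\n%%EOF");
        System.out.println(t.dictionary + " -> " + t.startXRef);
    }
}
```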

getDoc

public STDocument getDoc()

Copyright © 2006 intarsys consulting GmbH. All Rights Reserved.