Xerces 3.1.1 API: Class BaseMarkupSerializer
Xerces 3.1.1


Class BaseMarkupSerializer

java.lang.Object
  |

Base class for a serializer supporting both DOM and SAX pretty serializing of XML/HTML/XHTML documents. Derived classes perform the method-specific serializing; this class provides the common serializing mechanisms.

The serializer must be initialized with the proper writer and output format (by calling #init) before it can be used. The serializer can be reused any number of times, but cannot be used concurrently by two threads.

If an output stream is used, the encoding is taken from the output format (defaults to UTF-8). If a writer is used, make sure the writer uses the same encoding (if applicable) as specified in the output format.
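
A minimal sketch of keeping a writer's encoding in step with the declared one, using only JDK classes. The class name EncodingMatch and the hand-written XML declaration are illustrative assumptions and are not part of the serializer API; the text is written without escaping to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMatch {
    // Writes a tiny document, encoding the bytes with the same charset
    // that is named in the XML declaration.
    static byte[] write(Charset charset, String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, charset)) {
            out.write("<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>");
            out.write("<p>" + text + "</p>");
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = write(StandardCharsets.UTF_8, "caf\u00e9");
        System.out.println(new String(utf8, StandardCharsets.UTF_8));
    }
}
```

If the writer were created with a different charset than the one declared, a reader trusting the declaration would decode non-ASCII characters incorrectly.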

The serializer supports both DOM and SAX. DOM serializing is done by calling serialize, and SAX serializing is done by firing SAX events and using the serializer as a document handler. This also applies to derived classes.

If an I/O exception occurs while serializing, the serializer will not throw an exception directly, but only throw it at the end of serializing (see endDocument).
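
The deferred-error behaviour can be sketched roughly as follows. DeferredErrorWriter, print and finish are hypothetical names chosen for illustration; this is an assumption about the general pattern, not the serializer's actual implementation.

```java
import java.io.IOException;
import java.io.Writer;

class DeferredErrorWriter {
    private final Writer out;
    private IOException firstError; // remembered, not thrown immediately

    DeferredErrorWriter(Writer out) {
        this.out = out;
    }

    // Writes text; on failure, records the first exception and stops writing.
    void print(String text) {
        if (firstError != null) {
            return; // already failed; ignore further output
        }
        try {
            out.write(text);
        } catch (IOException e) {
            firstError = e;
        }
    }

    // Called once at the end of serializing: reports any stored error,
    // otherwise flushes the underlying writer.
    void finish() throws IOException {
        if (firstError != null) {
            throw firstError;
        }
        out.flush();
    }
}
```

Callers can therefore issue many print calls without handling exceptions and check for failure once, at the end.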

For elements that are not specified as whitespace preserving, the serializer may break long text lines at space boundaries, indent lines, and serialize elements on separate lines. Line terminators will be regarded as spaces, and spaces at the beginning of a line will be stripped.
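
As a rough illustration of breaking at space boundaries, the sketch below wraps text to a given width. LineBreaker and wrap are hypothetical names, and the sketch simplifies by collapsing every run of whitespace into a single break point, which may differ from what the serializer does with multiple spaces.

```java
import java.util.ArrayList;
import java.util.List;

class LineBreaker {
    // Breaks text into lines no longer than width where possible, treating
    // any run of whitespace (including line terminators) as one break point.
    static List<String> wrap(String text, int width) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // input was empty or all whitespace
            }
            if (line.length() > 0 && line.length() + 1 + word.length() > width) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }
}
```

A word longer than the width is kept whole on its own line, since there is no space boundary inside it.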

When indenting, the serializer is capable of detecting what appears to be element content, and serializing these elements indented on separate lines. An element is serialized indented when it is the first or last child of an element, or immediately following or preceding another element.

Version:
$Revision: 1.20 $ $Date: 2000/09/08 01:45:49 $


Field Summary
protected  java.lang.String _docTypePublicId
          The public identifier of the document type, if known.
protected  java.lang.String _docTypeSystemId
          The system identifier of the document type, if known.
_format
          The output format associated with this serializer.
protected  boolean _indenting
          True if indenting printer.
protected  java.util.Hashtable _prefixes
          Association between namespace URIs (keys) and prefixes (values).
_printer
          The printer used for printing text parts.
protected  boolean _started
          If the document has been started (header serialized), this flag is set to true so it's not started twice.

Constructor Summary
protected  BaseMarkupSerializer
          Protected constructor can only be used by derived class.

Method Summary
asContentHandler
asDocumentHandler
asDOMSerializer
void attributeDecl(java.lang.String eName, java.lang.String aName, java.lang.String type, java.lang.String valueDefault, java.lang.String value)
          Report an attribute type declaration.
void characters(char[] chars, int start, int length)
          Receive notification of character data.
protected  void characters(java.lang.String text)
          Called to print the text contents in the prevailing element format.
void comment(char[] chars, int start, int length)
          Report an XML comment anywhere in the document.
void comment(java.lang.String text)
content
          Must be called by a method about to print any type of content.
void elementDecl(java.lang.String name, java.lang.String model)
          Report an element type declaration.
void endCDATA()
          Report the end of a CDATA section.
void endDocument()
          Called at the end of the document to wrap it up.
void endDTD()
          Report the end of DTD declarations.
void endEntity(java.lang.String name)
          Report the end of an entity.
void endNonEscaping()
void endPrefixMapping(java.lang.String prefix)
          End the scope of a prefix-URI mapping.
void endPreserving()
enterElementState(…, java.lang.String localName, java.lang.String rawName, boolean preserveSpace)
          Enter a new element state for the specified element.
void externalEntityDecl(java.lang.String name, java.lang.String publicId, java.lang.String systemId)
          Report a parsed external entity declaration.
getElementState
          Return the state of the current element.
protected abstract  java.lang.String getEntityRef(char ch)
          Returns the suitable entity reference for this character value, or null if no such entity exists.
protected  java.lang.String getPrefix(java.lang.String namespaceURI)
          Returns the namespace prefix for the specified URI.
void ignorableWhitespace(char[] chars, int start, int length)
          Receive notification of ignorable whitespace in element content.
void internalEntityDecl(java.lang.String name, java.lang.String value)
          Report an internal entity declaration.
protected  boolean isDocumentState()
          Returns true if in the state of the document.
leaveElementState
          Leave the current element state and return to the state of the parent element.
void notationDecl(java.lang.String name, java.lang.String publicId, java.lang.String systemId)
          Receive notification of a notation declaration event.
protected  void prepare()
protected  void printDoctypeURL(java.lang.String url)
          Print a document type public or system identifier URL.
protected  void printEscaped(char ch)
protected  void printEscaped(java.lang.String source)
          Escapes a string so it may be printed as text content or attribute value.
protected  void printText(char[] chars, int start, int length, boolean preserveSpace, boolean unescaped)
          Called to print additional text with whitespace handling.
protected  void printText(java.lang.String text, boolean preserveSpace, boolean unescaped)
void processingInstruction(java.lang.String target, java.lang.String code)
          Receive notification of a processing instruction.
boolean reset()
void serialize(Document doc)
          Serializes the DOM document using the previously specified writer and output format.
void serialize(DocumentFragment frag)
          Serializes the DOM document fragment using the previously specified writer and output format.
void serialize(Element elem)
          Serializes the DOM element using the previously specified writer and output format.
protected abstract  void serializeElement(Element elem)
          Called to serialize the DOM element.
protected  void serializeNode(Node node)
          Serialize the DOM node.
protected  void serializePreRoot()
          Comments and PIs cannot be serialized before the root element, because the root element serializes the document type, which generally comes first.
void setDocumentLocator
          Receive an object for locating the origin of SAX document events.
void setOutputByteStream(java.io.OutputStream output)
          Specifies an output stream to which the document should be serialized.
void setOutputCharStream(java.io.Writer writer)
          Specifies a writer to which the document should be serialized.
void setOutputFormat
          Specifies an output format for this serializer.
void skippedEntity(java.lang.String name)
          Receive notification of a skipped entity.
void startCDATA()
          Report the start of a CDATA section.
void startDocument()
          Receive notification of the beginning of a document.
void startDTD(java.lang.String name, java.lang.String publicId, java.lang.String systemId)
          Report the start of DTD declarations, if any.
void startEntity(java.lang.String name)
          Report the beginning of an entity in content.
void startNonEscaping()
void startPrefixMapping(java.lang.String prefix, java.lang.String uri)
          Begin the scope of a prefix-URI Namespace mapping.
void startPreserving()
void unparsedEntityDecl(java.lang.String name, java.lang.String publicId, java.lang.String systemId, java.lang.String notationName)
          Receive notification of an unparsed entity declaration event.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

_started

protected boolean _started
If the document has been started (header serialized), this flag is set to true so it's not started twice.

_prefixes

protected java.util.Hashtable _prefixes
Association between namespace URIs (keys) and prefixes (values). Accumulated here prior to starting an element and placing this list in the element state.

_docTypePublicId

protected java.lang.String _docTypePublicId
The public identifier of the document type, if known.

_docTypeSystemId

protected java.lang.String _docTypeSystemId
The system identifier of the document type, if known.

_format

The output format associated with this serializer. This will never be a null reference. If no format was passed to the constructor, the default one for this document type will be used. The format object is never changed by the serializer.

_printer

The printer used for printing text parts.

_indenting

protected boolean _indenting
True if indenting printer.
Constructor Detail
BaseMarkupSerializer
Protected constructor can only be used by derived class. Must initialize the serializer before serializing any document, see #init.
Method Detail

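asDocumentHandler

                                  throws java.io.IOException
If the serializer does not support the DocumentHandler interface, it should return null.

asContentHandler

                                throws java.io.IOException
If the serializer does not support the ContentHandler interface, it should return null.

asDOMSerializer

                              throws java.io.IOException
If the serializer does not support the DOMSerializer interface, it should return null.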

setOutputByteStream

public void setOutputByteStream(java.io.OutputStream output)
Specifies an output stream to which the document should be serialized. This method should not be called while the serializer is in the process of serializing a document.

setOutputCharStream

public void setOutputCharStream(java.io.Writer writer)
Specifies a writer to which the document should be serialized. This method should not be called while the serializer is in the process of serializing a document.

setOutputFormat
Specifies an output format for this serializer. If the serializer has already been associated with an output format, it will switch to the new format. This method should not be called while the serializer is in the process of serializing a document.
Parameters:
format - The output format to use

reset

public boolean reset()

prepare

protected void prepare()
                throws java.io.IOException

serialize

public void serialize(Element elem)
               throws java.io.IOException
Serializes the DOM element using the previously specified writer and output format. Throws an exception only if an I/O exception occurred while serializing.
Parameters:
elem - The element to serialize
Throws:
java.io.IOException - An I/O exception occurred while serializing

serialize

public void serialize(DocumentFragment frag)
               throws java.io.IOException
Serializes the DOM document fragment using the previously specified writer and output format. Throws an exception only if an I/O exception occurred while serializing.
Parameters:
frag - The document fragment to serialize
Throws:
java.io.IOException - An I/O exception occurred while serializing

serialize

public void serialize(Document doc)
               throws java.io.IOException
Serializes the DOM document using the previously specified writer and output format. Throws an exception only if an I/O exception occurred while serializing.
Parameters:
doc - The document to serialize
Throws:
java.io.IOException - An I/O exception occurred while serializing

startDocument

public void startDocument()
Receive notification of the beginning of a document.

The SAX parser will invoke this method only once, before any other methods in this interface (except for setDocumentLocator).


characters

public void characters(char[] chars,
                       int start,
                       int length)
Receive notification of character data.

The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.

The application must not attempt to read from the array outside of the specified range.

Note that some parsers will report whitespace in element content using the ignorableWhitespace method rather than this one (validating parsers must do so).
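
Because character data may arrive in several chunks, a handler usually accumulates it. The sketch below uses the JDK's built-in SAX parser and a hypothetical TextCollector handler; it is a standalone illustration and not part of this class.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] chars, int start, int length) {
        // Read only the reported range; the rest of the array belongs to the parser.
        text.append(chars, start, length);
    }

    String text() {
        return text.toString();
    }

    // Parses the given XML string and returns all character data, concatenated.
    static String collect(String xml) throws Exception {
        TextCollector handler = new TextCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.text();
    }
}
```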


ignorableWhitespace

public void ignorableWhitespace(char[] chars,
                                int start,
                                int length)
Receive notification of ignorable whitespace in element content.

Validating Parsers must use this method to report each chunk of whitespace in element content (see the W3C XML 1.0 recommendation, section 2.10): non-validating parsers may also use this method if they are capable of parsing and using content models.

SAX parsers may return all contiguous whitespace in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity, so that the Locator provides useful information.

The application must not attempt to read from the array outside of the specified range.


processingInstruction

public void processingInstruction(java.lang.String target,
                                  java.lang.String code)
Receive notification of a processing instruction.

The Parser will invoke this method once for each processing instruction found: note that processing instructions may occur before or after the main document element.

A SAX parser must never report an XML declaration (XML 1.0, section 2.8) or a text declaration (XML 1.0, section 4.3.1) using this method.

Parameters:
target - The processing instruction target.
code - The processing instruction data, or null if none was supplied. The data does not include any whitespace separating it from the target.

comment

public void comment(char[] chars,
                    int start,
                    int length)
Report an XML comment anywhere in the document.

This callback will be used for comments inside or outside the document element, including comments in the external DTD subset (if read).


comment

public void comment(java.lang.String text)

startCDATA

public void startCDATA()
Report the start of a CDATA section.

The contents of the CDATA section will be reported through the regular characters event.


endCDATA

public void endCDATA()
Report the end of a CDATA section.

startNonEscaping

public void startNonEscaping()

endNonEscaping

public void endNonEscaping()

startPreserving

public void startPreserving()

endPreserving

public void endPreserving()

endDocument

public void endDocument()
Called at the end of the document to wrap it up. Will flush the output stream and throw an exception if any I/O error occurred while serializing.

startEntity

public void startEntity(java.lang.String name)
Report the beginning of an entity in content.

NOTE: entity references in attribute values -- and the start and end of the document entity -- are never reported.

The start and end of the external DTD subset are reported using the pseudo-name "[dtd]". All other events must be properly nested within start/end entity events.

Note that skipped entities will be reported through the skippedEntity event, which is part of the ContentHandler interface.

Parameters:
name - The name of the entity. If it is a parameter entity, the name will begin with '%'.

endEntity

public void endEntity(java.lang.String name)
Report the end of an entity.

setDocumentLocator
Receive an object for locating the origin of SAX document events.

SAX parsers are strongly encouraged (though not absolutely required) to supply a locator: if a parser does so, it must supply the locator to the application by invoking this method before invoking any of the other methods in the ContentHandler interface.

The locator allows the application to determine the end position of any document-related event, even if the parser is not reporting an error. Typically, the application will use this information for reporting its own errors (such as character content that does not match an application's business rules). The information returned by the locator is probably not sufficient for use with a search engine.

Note that the locator will return correct information only during the invocation of the events in this interface. The application should not attempt to use it at any other time.

Parameters:
locator - An object that can return the location of any SAX document event.
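
A small sketch of how an application might use the locator for its own reporting, written against the JDK's SAX parser. LineTracker and track are hypothetical names; the handler only reads the locator from inside a callback, as required above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

class LineTracker extends DefaultHandler {
    private Locator locator;
    final List<String> seen = new ArrayList<>();

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator; // only meaningful while callbacks are running
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Record the element name with the line reported for this event,
        // or -1 if the parser supplied no locator.
        int line = (locator != null) ? locator.getLineNumber() : -1;
        seen.add(qName + "@" + line);
    }

    static List<String> track(String xml) throws Exception {
        LineTracker handler = new LineTracker();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.seen;
    }
}
```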

skippedEntity

public void skippedEntity(java.lang.String name)
Receive notification of a skipped entity.

The Parser will invoke this method once for each entity skipped. Non-validating processors may skip entities if they have not seen the declarations (because, for example, the entity was declared in an external DTD subset). All processors may skip external entities, depending on the values of the external-general-entities and the external-parameter-entities features.

Parameters:
name - The name of the skipped entity. If it is a parameter entity, the name will begin with '%', and if it is the external DTD subset, it will be the string "[dtd]".

startPrefixMapping

public void startPrefixMapping(java.lang.String prefix,
                               java.lang.String uri)
Begin the scope of a prefix-URI Namespace mapping.

The information from this event is not necessary for normal Namespace processing: the SAX XML reader will automatically replace prefixes for element and attribute names when the namespaces feature is true (the default).

There are cases, however, when applications need to use prefixes in character data or in attribute values, where they cannot safely be expanded automatically; the start/endPrefixMapping event supplies the information to the application to expand prefixes in those contexts itself, if necessary.

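Note that start/endPrefixMapping events are not guaranteed to be properly nested relative to each other: all startPrefixMapping events will occur before the corresponding startElement event, and all endPrefixMapping events will occur after the corresponding endElement event, but their order is not guaranteed.

There should never be start/endPrefixMapping events for the "xml" prefix, since it is predeclared and immutable.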
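
The case of prefixes inside attribute values can be sketched as follows, using the JDK's namespace-aware SAX parser. QNameValueResolver and the "type" attribute are illustrative assumptions; the sketch also ignores shadowing of an outer mapping by an inner one.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class QNameValueResolver extends DefaultHandler {
    private final Map<String, String> inScope = new HashMap<>();
    String resolvedUri; // namespace URI of the prefix used in the "type" attribute value

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        inScope.put(prefix, uri);
    }

    @Override
    public void endPrefixMapping(String prefix) {
        inScope.remove(prefix); // simplification: outer mappings are not restored
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // The parser does not expand prefixes inside attribute values,
        // so the application resolves them from the recorded mappings.
        String value = atts.getValue("", "type");
        if (value != null && value.contains(":")) {
            resolvedUri = inScope.get(value.substring(0, value.indexOf(':')));
        }
    }

    static String resolve(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        QNameValueResolver handler = new QNameValueResolver();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler.resolvedUri;
    }
}
```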


endPrefixMapping

public void endPrefixMapping(java.lang.String prefix)
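End the scope of a prefix-URI mapping. See startPrefixMapping for details. This event will always occur after the corresponding endElement event, but the order of endPrefixMapping events is not otherwise guaranteed.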


startDTD

public void startDTD(java.lang.String name,
                     java.lang.String publicId,
                     java.lang.String systemId)
Report the start of DTD declarations, if any.

Any declarations are assumed to be in the internal subset unless otherwise indicated by a startEntity event.

Note that the start/endDTD events will appear within the start/endDocument events from ContentHandler and before the first startElement event.

Parameters:
name - The document type name.
publicId - The declared public identifier for the external DTD subset, or null if none was declared.
systemId - The declared system identifier for the external DTD subset, or null if none was declared.

endDTD

public void endDTD()
Report the end of DTD declarations.

elementDecl

public void elementDecl(java.lang.String name,
                        java.lang.String model)
Report an element type declaration.

The content model will consist of the string "EMPTY", the string "ANY", or a parenthesised group, optionally followed by an occurrence indicator. The model will be normalized so that all whitespace is removed, and will include the enclosing parentheses.


attributeDecl

public void attributeDecl(java.lang.String eName,
                          java.lang.String aName,
                          java.lang.String type,
                          java.lang.String valueDefault,
                          java.lang.String value)
Report an attribute type declaration.

Only the effective (first) declaration for an attribute will be reported. The type will be one of the strings "CDATA", "ID", "IDREF", "IDREFS", "NMTOKEN", "NMTOKENS", "ENTITY", "ENTITIES", or "NOTATION", or a parenthesized token group with the separator "|" and all whitespace removed.

Parameters:
eName - The name of the associated element.
aName - The name of the attribute.
type - A string representing the attribute type.
valueDefault - A string representing the attribute default ("#IMPLIED", "#REQUIRED", or "#FIXED") or null if none of these applies.
value - A string representing the attribute's default value, or null if there is none.

internalEntityDecl

public void internalEntityDecl(java.lang.String name,
                               java.lang.String value)
Report an internal entity declaration.

Only the effective (first) declaration for each entity will be reported.

Parameters:
name - The name of the entity. If it is a parameter entity, the name will begin with '%'.

externalEntityDecl

public void externalEntityDecl(java.lang.String name,
                               java.lang.String publicId,
                               java.lang.String systemId)
Report a parsed external entity declaration.

Only the effective (first) declaration for each entity will be reported.

Parameters:
name - The name of the entity. If it is a parameter entity, the name will begin with '%'.
publicId - The declared public identifier of the entity, or null if none was declared.

unparsedEntityDecl

public void unparsedEntityDecl(java.lang.String name,
                               java.lang.String publicId,
                               java.lang.String systemId,
                               java.lang.String notationName)
Receive notification of an unparsed entity declaration event.

Note that the notation name corresponds to a notation reported by the notationDecl event. It is up to the application to record the entity for later reference, if necessary.

If the system identifier is a URL, the parser must resolve it fully before passing it to the application.

Parameters:
name - The unparsed entity's name.
publicId - The entity's public identifier, or null if none was given.

notationDecl

public void notationDecl(java.lang.String name,
                         java.lang.String publicId,
                         java.lang.String systemId)
Receive notification of a notation declaration event.

It is up to the application to record the notation for later reference, if necessary.

At least one of publicId and systemId must be non-null. If a system identifier is present, and it is a URL, the SAX parser must resolve it fully before passing it to the application through this event.

There is no guarantee that the notation declaration will be reported before any unparsed entities that use it.

Parameters:
name - The notation name.
publicId - The notation's public identifier, or null if none was given.
systemId - The notation's system identifier, or null if none was given.

serializeNode

protected void serializeNode(Node node)
Serialize the DOM node. This method is shared across the XML, HTML and XHTML serializers.

content

Must be called by a method about to print any type of content. If the element was just opened, the opening tag is closed and will be matched to a closing tag. Returns the current element state with empty and afterElement set to false.
Returns:
The current element state

characters

protected void characters(java.lang.String text)
Called to print the text contents in the prevailing element format. Since this method is capable of printing text as CDATA, it is used for that purpose as well. White space handling is determined by the current element state. In addition, the output format can dictate whether the text is printed as CDATA or unescaped.
Parameters:
text - The text to print
unescaped - True if the text should be printed unescaped

getEntityRef

protected abstract java.lang.String getEntityRef(char ch)
Returns the suitable entity reference for this character value, or null if no such entity exists. Calling this method with '&' will return "&amp;".
Parameters:
ch - Character value
Returns:
Character entity name, or null

serializeElement

protected abstract void serializeElement(Element elem)
Called to serialize the DOM element. The element is serialized based on the serializer's method (XML, HTML, XHTML).
Parameters:
elem - The element to serialize

serializePreRoot

protected void serializePreRoot()
Comments and PIs cannot be serialized before the root element, because the root element serializes the document type, which generally comes first. Instead, such PIs and comments are accumulated inside a vector and serialized by calling this method. Will be called when the root element is serialized and when the document has finished serializing.
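
The accumulate-then-flush idea can be sketched in isolation like this. PreRootBuffer and its methods are hypothetical names, and the output layout is an assumption made for the example rather than the serializer's actual formatting.

```java
import java.util.ArrayList;
import java.util.List;

class PreRootBuffer {
    private final StringBuilder out = new StringBuilder();
    private final List<String> preRoot = new ArrayList<>();
    private boolean rootStarted;

    // Comments seen before the root element are held back instead of printed.
    void comment(String text) {
        String markup = "<!--" + text + "-->";
        if (rootStarted) {
            out.append(markup);
        } else {
            preRoot.add(markup);
        }
    }

    // The root element prints the document type first, then the held-back items.
    void startRoot(String name, String systemId) {
        out.append("<!DOCTYPE ").append(name)
           .append(" SYSTEM \"").append(systemId).append("\">\n");
        rootStarted = true;
        flushPreRoot();
        out.append('<').append(name).append('>');
    }

    private void flushPreRoot() {
        for (String item : preRoot) {
            out.append(item).append('\n');
        }
        preRoot.clear();
    }

    // Also flushes at the end, in case no root element was ever started.
    String result() {
        flushPreRoot();
        return out.toString();
    }
}
```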

printText

protected final void printText(char[] chars,
                               int start,
                               int length,
                               boolean preserveSpace,
                               boolean unescaped)
Called to print additional text with whitespace handling. If spaces are preserved, the text is printed as if by calling #printText(String) with a call to #breakLine for each new line. If spaces are not preserved, the text is broken at space boundaries if longer than the line width; multiple spaces are printed as such, but spaces at the beginning of a line are removed.
Parameters:
chars - The text to print
preserveSpace - Space preserving flag
unescaped - Print unescaped

printText

protected final void printText(java.lang.String text,
                               boolean preserveSpace,
                               boolean unescaped)

printDoctypeURL

protected void printDoctypeURL(java.lang.String url)
Print a document type public or system identifier URL. Encapsulates the URL in double quotes and escapes non-printing characters.
Parameters:
url - The document type url to print

printEscaped

protected void printEscaped(char ch)

printEscaped

protected void printEscaped(java.lang.String source)
Escapes a string so it may be printed as text content or attribute value. Non-printable characters are escaped using character references. Where the format specifies a default entity reference, that reference is used (e.g. &lt;).
Parameters:
source - The string to escape
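
A minimal sketch of this kind of escaping, under the assumption that only the four common markup characters have entity references and that everything outside printable ASCII is written as a hexadecimal character reference. TextEscaper is a hypothetical name; the real method consults the output format and getEntityRef instead of a fixed table.

```java
class TextEscaper {
    // Escapes markup-significant characters with entity references and
    // non-printable or non-ASCII characters with numeric character references.
    static String escape(String source) {
        StringBuilder out = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); ) {
            int cp = source.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                default:
                    boolean printable = (cp >= ' ' && cp < 0x7F)
                            || cp == '\n' || cp == '\r' || cp == '\t';
                    if (printable) {
                        out.append((char) cp);
                    } else {
                        out.append("&#x").append(Integer.toHexString(cp)).append(';');
                    }
            }
        }
        return out.toString();
    }
}
```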

getElementState

Return the state of the current element.
Returns:
Current element state

enterElementState

                                                                  java.lang.String localName,
                                                                  java.lang.String rawName,
                                                                  boolean preserveSpace)
Enter a new element state for the specified element. Tag name and space preserving are specified; the element state is initially empty.
Returns:
Current element state, or null

leaveElementState

Leave the current element state and return to the state of the parent element. If this was the root element, return to the state of the document.
Returns:
Previous element state

isDocumentState

protected boolean isDocumentState()
Returns true if in the state of the document. Returns true before entering any element and after leaving the root element.
Returns:
True if in the state of the document
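
The enter/leave/document-state behaviour described for the element state methods can be pictured as a stack. The sketch below is an assumption about the general shape only: ElementStateStack and its nested State class are hypothetical and carry far fewer fields than the real element state.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ElementStateStack {
    // Hypothetical per-element record used only for this illustration.
    static final class State {
        final String rawName;
        final boolean preserveSpace;
        boolean empty = true; // no content printed yet

        State(String rawName, boolean preserveSpace) {
            this.rawName = rawName;
            this.preserveSpace = preserveSpace;
        }
    }

    private final State document = new State(null, false);
    private final Deque<State> stack = new ArrayDeque<>();

    // Pushes a new, initially empty state for the element being opened.
    State enter(String rawName, boolean preserveSpace) {
        State state = new State(rawName, preserveSpace);
        stack.push(state);
        return state;
    }

    // Pops the current element and returns the parent's state,
    // or the document state when the root element is left.
    State leave() {
        if (stack.isEmpty()) {
            throw new IllegalStateException("no open element");
        }
        stack.pop();
        return current();
    }

    State current() {
        return stack.isEmpty() ? document : stack.peek();
    }

    // True before entering any element and after leaving the root element.
    boolean isDocumentState() {
        return stack.isEmpty();
    }
}
```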

getPrefix

protected java.lang.String getPrefix(java.lang.String namespaceURI)
Returns the namespace prefix for the specified URI. If the URI has been mapped to a prefix, returns the prefix, otherwise returns null.
Parameters:
namespaceURI - The namespace URI
Returns:
The namespace prefix if known, or null
