org.lobobrowser.html.parser
Class HtmlParser

java.lang.Object
  extended by org.lobobrowser.html.parser.HtmlParser

public class HtmlParser
extends java.lang.Object

The HtmlParser class is an HTML DOM parser. It provides the parsing functionality behind DocumentBuilderImpl, the standard DOM parser implementation, and may be used directly when a different DOM implementation is preferred.


Field Summary
static java.lang.String MODIFYING_KEY
          A node UserData key used to tell nodes that their content may be about to be modified.
 
Constructor Summary
HtmlParser(org.w3c.dom.Document document, org.xml.sax.ErrorHandler errorHandler, java.lang.String publicId, java.lang.String systemId)
          Deprecated. Use a constructor that accepts a UserAgentContext instead.
HtmlParser(UserAgentContext ucontext, org.w3c.dom.Document document)
          Constructs an HtmlParser.
HtmlParser(UserAgentContext ucontext, org.w3c.dom.Document document, org.xml.sax.ErrorHandler errorHandler, java.lang.String publicId, java.lang.String systemId)
          Constructs an HtmlParser.
 
Method Summary
static boolean isDecodeEntities(java.lang.String elementName)
           
 void parse(java.io.InputStream in)
          Parses HTML from an input stream, assuming the character set is ISO-8859-1.
 void parse(java.io.InputStream in, java.lang.String charset)
          Parses HTML from an input stream, using the given character set.
 void parse(java.io.LineNumberReader reader)
           
 void parse(java.io.LineNumberReader reader, org.w3c.dom.Node parent)
          Builds the parsed DOM under the given node, for example when innerHTML is set from JavaScript.
 void parse(java.io.Reader reader)
          Parses HTML given by a Reader.
 void parse(java.io.Reader reader, org.w3c.dom.Node parent)
          Builds the parsed DOM under the given node, for example when innerHTML is set from JavaScript.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

MODIFYING_KEY

public static final java.lang.String MODIFYING_KEY
A node UserData key used to tell nodes that their content may be about to be modified. Elements could use this to temporarily suspend notifications. The value set will be either Boolean.TRUE or Boolean.FALSE.

See Also:
Constant Field Values
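
The following is a minimal sketch of how a node could read such a UserData flag before firing notifications. It uses only the standard org.w3c.dom API (Node.setUserData and Node.getUserData). The key string "example.modifying" and the class ModifyingKeySketch are hypothetical stand-ins; real code would reference HtmlParser.MODIFYING_KEY, whose constant value is not shown on this page.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class ModifyingKeySketch {
    // Hypothetical placeholder; substitute HtmlParser.MODIFYING_KEY when Cobra is available.
    static final String KEY = "example.modifying";

    /** Returns true only when the node carries Boolean.TRUE under KEY. */
    static boolean isBeingModified(Node node) {
        return Boolean.TRUE.equals(node.getUserData(KEY));
    }

    /** Creates a detached-from-nothing <div> inside a fresh JAXP document. */
    static Element newElement() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element div = doc.createElement("div");
            doc.appendChild(div);
            return div;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No JAXP DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        Element div = newElement();
        div.setUserData(KEY, Boolean.TRUE, null);   // content is about to change
        System.out.println(isBeingModified(div));
        div.setUserData(KEY, Boolean.FALSE, null);  // modification finished
        System.out.println(isBeingModified(div));
    }
}
```

An element implementation could consult a check like isBeingModified before notifying listeners and skip the notification while the flag is Boolean.TRUE.
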
Constructor Detail

HtmlParser

public HtmlParser(org.w3c.dom.Document document,
                  org.xml.sax.ErrorHandler errorHandler,
                  java.lang.String publicId,
                  java.lang.String systemId)
Deprecated. Use a constructor that accepts a UserAgentContext instead.

Constructs an HtmlParser.

Parameters:
document - A W3C Document instance.
errorHandler - The error handler.
publicId - The public ID of the document.
systemId - The system ID of the document.

HtmlParser

public HtmlParser(UserAgentContext ucontext,
                  org.w3c.dom.Document document,
                  org.xml.sax.ErrorHandler errorHandler,
                  java.lang.String publicId,
                  java.lang.String systemId)
Constructs an HtmlParser.

Parameters:
ucontext - The user agent context.
document - A W3C Document instance.
errorHandler - The error handler.
publicId - The public ID of the document.
systemId - The system ID of the document.
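
As an illustration of the errorHandler parameter, the sketch below implements the standard org.xml.sax.ErrorHandler interface and collects messages instead of printing them. The class name CollectingErrorHandler is hypothetical, and this page does not state which conditions cause HtmlParser to invoke the handler, so treat this as one possible handler rather than the expected one.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

class CollectingErrorHandler implements ErrorHandler {
    // Every reported problem, in the order it was received.
    final List<String> messages = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        record("warning", e);
    }

    @Override
    public void error(SAXParseException e) {
        record("error", e);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXParseException {
        record("fatal", e);
        throw e; // fatal errors are recorded and then propagated to the caller
    }

    private void record(String level, SAXParseException e) {
        messages.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
    }
}
```

An instance of such a class would be passed as the errorHandler argument, and its messages list inspected after parsing.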

HtmlParser

public HtmlParser(UserAgentContext ucontext,
                  org.w3c.dom.Document document)
Constructs an HtmlParser.

Parameters:
ucontext - The user agent context.
document - A W3C Document instance.
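
One way to obtain a W3C Document for the document parameter is through the JDK's JAXP factory, sketched below. The class EmptyDocumentSketch is hypothetical, the HtmlParser calls appear only as comments because they need Cobra on the classpath, and it is an assumption that a plain JAXP document is a suitable target for the parser.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

class EmptyDocumentSketch {
    /** Creates an empty W3C Document using the JDK's default DOM implementation. */
    static Document newEmptyDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No JAXP DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        Document document = newEmptyDocument();
        // With Cobra available and a UserAgentContext named ucontext obtained elsewhere:
        //   HtmlParser parser = new HtmlParser(ucontext, document);
        //   parser.parse(new java.io.StringReader("<p>Hello</p>"));
        System.out.println(document.getDocumentElement() == null);
    }
}
```
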
Method Detail

isDecodeEntities

public static boolean isDecodeEntities(java.lang.String elementName)

parse

public void parse(java.io.InputStream in)
           throws java.io.IOException,
                  org.xml.sax.SAXException,
                  java.io.UnsupportedEncodingException
Parses HTML from an input stream, assuming the character set is ISO-8859-1.

Parameters:
in - The input stream.
Throws:
java.io.IOException - Thrown when there are errors reading the stream.
org.xml.sax.SAXException - Thrown when there are parse errors.
java.io.UnsupportedEncodingException - Thrown if the ISO-8859-1 character set is not supported.

parse

public void parse(java.io.InputStream in,
                  java.lang.String charset)
           throws java.io.IOException,
                  org.xml.sax.SAXException,
                  java.io.UnsupportedEncodingException
Parses HTML from an input stream, using the given character set.

Parameters:
in - The input stream.
charset - The character set.
Throws:
java.io.IOException - Thrown when there's an error reading from the stream.
org.xml.sax.SAXException - Thrown when there is a parser error.
java.io.UnsupportedEncodingException - Thrown if the character set is not supported.
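
The choice of character set matters because the same bytes decode to different text. The sketch below uses only JDK classes to show this, and to show where UnsupportedEncodingException comes from when a charset name is unknown. CharsetSketch is a hypothetical helper, and it is an assumption that the parser decodes its input in a comparable way.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UnsupportedEncodingException;

class CharsetSketch {
    /** Decodes all bytes with the named charset by reading through an InputStreamReader. */
    static String decode(byte[] bytes, String charset) throws IOException {
        // The InputStreamReader constructor throws UnsupportedEncodingException
        // (a subclass of IOException) when the charset name is not recognized.
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        }
    }

    /** Reports whether the charset name can be used for decoding. */
    static boolean isSupported(String charset) {
        try {
            decode(new byte[0], charset);
            return true;
        } catch (UnsupportedEncodingException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA9}; // U+00E9 encoded as UTF-8
        System.out.println(decode(utf8, "UTF-8").length());      // one character
        System.out.println(decode(utf8, "ISO-8859-1").length()); // two characters
        System.out.println(isSupported("no-such-charset"));
    }
}
```

When the encoding of the input is known, passing it explicitly to parse(InputStream, String) avoids the mis-decoding that the ISO-8859-1 default would cause for UTF-8 input.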

parse

public void parse(java.io.Reader reader)
           throws java.io.IOException,
                  org.xml.sax.SAXException
Parses HTML given by a Reader. This method appends nodes to the document provided to the parser.

Parameters:
reader - An instance of Reader.
Throws:
java.io.IOException - Thrown if there are errors reading the input stream.
org.xml.sax.SAXException - Thrown if there are parse errors.

parse

public void parse(java.io.LineNumberReader reader)
           throws java.io.IOException,
                  org.xml.sax.SAXException
Throws:
java.io.IOException - Thrown if there are errors reading from the reader.
org.xml.sax.SAXException - Thrown if there are parse errors.
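
This page does not say why a LineNumberReader overload exists. A plausible reason, stated here as an assumption, is that it lets the caller or the parser relate a problem to a source line. The sketch below shows the standard java.io.LineNumberReader behavior that makes this possible; LineNumberSketch and lineOf are hypothetical names.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;

class LineNumberSketch {
    /** Returns the 1-based line containing the first occurrence of marker, or -1 if absent. */
    static int lineOf(String html, String marker) throws IOException {
        try (LineNumberReader reader = new LineNumberReader(new StringReader(html))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // getLineNumber() counts the lines read so far, so right after
                // readLine() it equals the 1-based number of the line just returned.
                if (line.contains(marker)) {
                    return reader.getLineNumber();
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) throws IOException {
        String html = "<html>\n<body>\n<p>hi</p>\n</body>\n</html>";
        System.out.println(lineOf(html, "<p>"));
    }
}
```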

parse

public void parse(java.io.Reader reader,
                  org.w3c.dom.Node parent)
           throws java.io.IOException,
                  org.xml.sax.SAXException
Builds the parsed DOM under the given node, for example when innerHTML is set from JavaScript.

Parameters:
reader - A document reader.
parent - The root node for the parsed DOM.
Throws:
java.io.IOException - Thrown if there are errors reading from the reader.
org.xml.sax.SAXException - Thrown if there are parse errors.
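
A caller emulating an innerHTML assignment would typically clear the parent's existing children before parsing the new markup under it. Whether the parser removes existing children itself is not stated here, so the sketch below does it explicitly with the standard DOM API. InnerHtmlSketch and its methods are hypothetical, and the HtmlParser call is shown only as a comment because it requires Cobra.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class InnerHtmlSketch {
    /** Removes every child of parent, leaving the parent itself in place. */
    static void removeChildren(Node parent) {
        while (parent.getFirstChild() != null) {
            parent.removeChild(parent.getFirstChild());
        }
    }

    /** Builds a <div> that already holds two child nodes, standing in for old content. */
    static Element sampleParent() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element div = doc.createElement("div");
            doc.appendChild(div);
            div.appendChild(doc.createElement("p"));
            div.appendChild(doc.createTextNode("old text"));
            return div;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No JAXP DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        Element parent = sampleParent();
        removeChildren(parent);
        // With Cobra available and an HtmlParser named parser built for parent's document:
        //   parser.parse(new java.io.StringReader("<b>new</b> content"), parent);
        System.out.println(parent.getChildNodes().getLength());
    }
}
```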

parse

public void parse(java.io.LineNumberReader reader,
                  org.w3c.dom.Node parent)
           throws java.io.IOException,
                  org.xml.sax.SAXException
Builds the parsed DOM under the given node, for example when innerHTML is set from JavaScript.

Parameters:
reader - A LineNumberReader for the document.
parent - The root node for the parsed DOM.
Throws:
java.io.IOException - Thrown if there are errors reading from the reader.
org.xml.sax.SAXException - Thrown if there are parse errors.


Copyright © 2005, 2006, 2007 The Lobo Project. All Rights Reserved.