HTMLParser (HappyDoc3-r3_1/happydoclib/docset/docset_TAL/TAL/HTMLParser.py)
Find tags and other markup and call handler functions.
Usage:
    p = HTMLParser()
    p.feed(data)
    ...
    p.close()
Start tags are handled by calling self.handle_starttag() or
self.handle_startendtag(); end tags by self.handle_endtag(). The
data between tags is passed from the parser to the derived class
by calling self.handle_data() with the data as argument (the data
may be split into arbitrary chunks). Entity references are
passed by calling self.handle_entityref() with the entity
reference as the argument. Numeric character references are
passed to self.handle_charref() with the string containing the
reference as the argument.
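The callback flow described above can be sketched with a small subclass. This is only an illustration: it assumes Python 3's standard-library `html.parser.HTMLParser` as a stand-in for the class documented here, and the class name `EventLogger` and the sample markup are hypothetical. `convert_charrefs=False` is passed because the modern standard-library class otherwise decodes references itself instead of calling the two reference handlers.

```python
from html.parser import HTMLParser  # assumption: stdlib stand-in for this class


class EventLogger(HTMLParser):
    """Hypothetical subclass that records every callback it receives."""

    def __init__(self):
        # Keep handle_entityref()/handle_charref() active in the stdlib parser.
        super().__init__(convert_charrefs=False)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        self.events.append(("charref", name))


p = EventLogger()
p.feed('<p class="x">a &amp; b &#62; c</p>')
p.close()
for event in p.events:
    print(event)
```

The handlers receive the entity name ("amp") and the numeric reference ("62") without the surrounding `&` and `;`.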
Methods

feed
feed ( self, data )
Feed data to the parser.
Call this as often as you want, with as little or as much text
as you want (may include '\n').
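Because feed() accepts text in pieces of any size, a tag or a run of text can straddle two calls. The sketch below assumes Python 3's standard-library `html.parser.HTMLParser` as a stand-in for this class; `TextCollector` and the sample pieces are hypothetical. It joins the data chunks rather than relying on how they happen to be split.

```python
from html.parser import HTMLParser  # assumption: stdlib stand-in for this class


class TextCollector(HTMLParser):
    """Hypothetical subclass that collects tags and text chunks."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(("start", tag))

    def handle_endtag(self, tag):
        self.tags.append(("end", tag))

    def handle_data(self, data):
        # Data may arrive in several chunks; store them and join later.
        self.chunks.append(data)


p = TextCollector()
# The end tag "</p>" is deliberately split across two feed() calls.
for piece in ("<p>Hel", "lo,\nwor", "ld</", "p>"):
    p.feed(piece)
p.close()
text = "".join(p.chunks)
print(p.tags, repr(text))
```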

handle_startendtag
handle_startendtag ( self, tag, attrs )
Overridable: finish processing of start+end tag: <tag.../>
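A minimal sketch of overriding handle_startendtag(), assuming Python 3's standard-library `html.parser.HTMLParser` as a stand-in for this class; `SelfClosingRecorder` and the markup are hypothetical.

```python
from html.parser import HTMLParser  # assumption: stdlib stand-in for this class


class SelfClosingRecorder(HTMLParser):
    """Hypothetical subclass separating <tag/> from ordinary start tags."""

    def __init__(self):
        super().__init__()
        self.opened = []
        self.self_closing = []

    def handle_starttag(self, tag, attrs):
        self.opened.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Called for tags written in the XHTML-style form <tag ... />.
        self.self_closing.append((tag, attrs))


p = SelfClosingRecorder()
p.feed('<div><br/><img src="a.png" /></div>')
p.close()
print(p.opened, p.self_closing)
```

In the standard-library class, the default implementation of handle_startendtag() calls handle_starttag() followed by handle_endtag(), so overriding it is only needed when the two forms must be told apart.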

handle_entityref
handle_entityref ( self, name )
Overridable: handle entity reference

close
close ( self )
Handle any buffered data.

handle_comment
handle_comment ( self, data )
Overridable: handle comment
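A short sketch of collecting comments, assuming Python 3's standard-library `html.parser.HTMLParser` as a stand-in for this class; `CommentCollector` is a hypothetical name.

```python
from html.parser import HTMLParser  # assumption: stdlib stand-in for this class


class CommentCollector(HTMLParser):
    """Hypothetical subclass that keeps the text of every comment."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # data is the comment body without the <!-- and --> delimiters.
        self.comments.append(data)


p = CommentCollector()
p.feed("<p>text<!-- note --></p>")
p.close()
print(p.comments)
```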

handle_starttag
handle_starttag ( self, tag, attrs )
Overridable: handle start tag

goahead
goahead ( self, end )
Internal: handle data as far as reasonable. May leave state
and data to be processed by a subsequent call. If end is
true, force handling all data as if followed by EOF marker.

set_cdata_mode
set_cdata_mode ( self, endtag=None )

parse_comment
parse_comment ( self, i, report=1 )
Internal: parse comment, return end or -1 if not terminated

get_starttag_text
get_starttag_text ( self )
Return full source of start tag: '<...>'.
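The difference between the parsed arguments and the raw tag source can be sketched as follows, assuming Python 3's standard-library `html.parser.HTMLParser` as a stand-in for this class; `RawTagRecorder` is a hypothetical name.

```python
from html.parser import HTMLParser  # assumption: stdlib stand-in for this class


class RawTagRecorder(HTMLParser):
    """Hypothetical subclass pairing parsed tags with their raw source."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # tag and attribute names arrive lowercased; get_starttag_text()
        # returns the tag exactly as it appeared in the input.
        self.seen.append((tag, attrs, self.get_starttag_text()))


p = RawTagRecorder()
p.feed('<A HREF="index.html">home</A>')
p.close()
print(p.seen)
```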

__init__
__init__ ( self )
Initialize and reset this instance.

handle_decl
handle_decl ( self, decl )
Overridable: handle declaration
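A minimal sketch of capturing a document type declaration, assuming Python 3's standard-library `html.parser.HTMLParser` as a stand-in for this class; `DeclCollector` is a hypothetical name.

```python
from html.parser import HTMLParser  # assumption: stdlib stand-in for this class


class DeclCollector(HTMLParser):
    """Hypothetical subclass that records declarations such as DOCTYPE."""

    def __init__(self):
        super().__init__()
        self.decls = []

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">".
        self.decls.append(decl)


p = DeclCollector()
p.feed("<!DOCTYPE html><html></html>")
p.close()
print(p.decls)
```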

clear_cdata_mode
clear_cdata_mode ( self )

check_for_whole_start_tag
check_for_whole_start_tag ( self, i )
Internal: check to see if we have a complete starttag; return end
or -1 if incomplete.
Exceptions
AssertionError( "we should not get here!" )

parse_starttag
parse_starttag ( self, i )
Internal: handle starttag, return end or -1 if not terminated

handle_endtag
handle_endtag ( self, tag )
Overridable: handle end tag

handle_charref
handle_charref ( self, name )
Overridable: handle character reference
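One way the two reference handlers, handle_entityref() and handle_charref(), might turn references back into characters is sketched below. It assumes Python 3's standard-library `html.parser.HTMLParser` (with `convert_charrefs=False` so the handlers are actually called) and `html.entities.name2codepoint`; `ReferenceDecoder` is a hypothetical name.

```python
from html.entities import name2codepoint
from html.parser import HTMLParser  # assumption: stdlib stand-in for this class


class ReferenceDecoder(HTMLParser):
    """Hypothetical subclass that decodes references into plain text."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        # name is "lt" for "&lt;"; unknown names are kept verbatim.
        codepoint = name2codepoint.get(name)
        self.parts.append(chr(codepoint) if codepoint is not None else "&%s;" % name)

    def handle_charref(self, name):
        # name is "62" for "&#62;" and "x26" for "&#x26;".
        if name[:1] in ("x", "X"):
            codepoint = int(name[1:], 16)
        else:
            codepoint = int(name)
        self.parts.append(chr(codepoint))


p = ReferenceDecoder()
p.feed("1 &lt; 2 &#62; 0 &#x26; done")
p.close()
decoded = "".join(p.parts)
print(decoded)
```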

parse_pi
parse_pi ( self, i )
Internal: parse processing instr, return end or -1 if not terminated

unknown_decl
unknown_decl ( self, data )

reset
reset ( self )
Reset this instance. Loses all unprocessed data.
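The loss of unprocessed data on reset() can be illustrated with an incomplete tag left in the buffer. This sketch assumes Python 3's standard-library `html.parser.HTMLParser` as a stand-in for this class; `TagLog` is a hypothetical name.

```python
from html.parser import HTMLParser  # assumption: stdlib stand-in for this class


class TagLog(HTMLParser):
    """Hypothetical subclass that records start tag names."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


p = TagLog()
p.feed("<em>kept</em><stro")  # "<stro" is incomplete and stays buffered
p.reset()                     # the buffered "<stro" is discarded here
p.feed("<b>next</b>")
p.close()
print(p.tags)
```

reset() clears only the parser's own state; attributes added by the subclass, such as `tags` here, are left alone.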

unescape
unescape ( self, s )
Internal: helper to remove special character quoting
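Since unescape() is an internal helper of this class, the sketch below does not call it. It uses Python 3's public `html.unescape()` function as a stand-in to show what removing special character quoting means; the sample string is hypothetical.

```python
import html

# Stand-in for the internal helper: html.unescape() from the standard library.
raw = "&lt;b&gt;Tom &amp; Jerry&lt;/b&gt; said &quot;hi&quot; &#38; left"
clean = html.unescape(raw)
print(clean)
```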

handle_pi
handle_pi ( self, data )
Overridable: handle processing instruction
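A minimal sketch of receiving a processing instruction, assuming Python 3's standard-library `html.parser.HTMLParser` as a stand-in for this class; `PICollector` is a hypothetical name.

```python
from html.parser import HTMLParser  # assumption: stdlib stand-in for this class


class PICollector(HTMLParser):
    """Hypothetical subclass that records processing instructions."""

    def __init__(self):
        super().__init__()
        self.instructions = []

    def handle_pi(self, data):
        # data is the text between "<?" and the closing ">"; a trailing
        # "?" written in the XML style is therefore part of data.
        self.instructions.append(data)


p = PICollector()
p.feed("<?php echo 1 ?><p>x</p>")
p.close()
print(p.instructions)
```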

parse_endtag
parse_endtag ( self, i )
Internal: parse endtag, return end or -1 if incomplete

error
error ( self, message )
Exceptions
HTMLParseError( message, self.getpos() )

handle_data
handle_data ( self, data )
Overridable: handle data