html-parser-19990912p2

What is html-parser

The html-parser package is a Ruby implementation of Python's SGML parser (sgmllib.py), HTML parser (htmllib.py) and Formatter (formatter.py).

Files

sgml-parser.rb - SGML Parser Class

sgml-parser.rb defines a class SGMLParser, which serves as the basis for parsing text files formatted in SGML (Standard Generalized Markup Language). It is not a full SGML parser: it parses SGML only insofar as HTML uses it, and the module exists only as a base for the HTMLParser class. Please see <URL:http://www.python.org/doc/current/lib/module-sgmllib.html> for details.

html-parser.rb - HTML Parser Class

html-parser.rb defines a class HTMLParser, which is a parser for HTML documents. Please see <URL:http://www.python.org/doc/current/lib/module-htmllib.html> for details.

formatter.rb - Formatter Class

formatter.rb defines four classes (NullFormatter, AbstractFormatter, NullWriter and DumbWriter) that together provide a generic output formatter and device interface. Please see <URL:http://www.python.org/doc/current/lib/module-formatter.html> for details.

htmltest.rb - HTML Parser Test Script

htmltest.rb is a sample script that uses the html-parser package.

Usage: htmltest.rb [HTML_FILE]
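The parser classes described above follow the handler style documented for Python's sgmllib, where a subclass defines methods such as start_<tag>, end_<tag> and handle_data, and the parser calls them as it reads the input. The sketch below illustrates that style only. TinyTagParser and TitleGrabber are hypothetical names and are not part of the html-parser package, and whether sgml-parser.rb uses exactly these method names is an assumption based on the Python documentation.

```ruby
# Hypothetical stand-in used for illustration; this is NOT the package's
# SGMLParser. It recognises only simple tags and text runs.
class TinyTagParser
  # Either a tag (optional slash, tag name, ignored attributes) or text.
  TOKEN = /<(\/?)([A-Za-z][A-Za-z0-9]*)[^>]*>|([^<]+)/

  # Scan the input and dispatch each token to a handler method.
  def feed(text)
    text.scan(TOKEN) do |slash, name, data|
      if data
        handle_data(data)
      elsif slash == "/"
        dispatch("end_#{name.downcase}")
      else
        dispatch("start_#{name.downcase}")
      end
    end
  end

  # Default text handler does nothing; subclasses override it.
  def handle_data(data); end

  private

  # Call the handler only if the subclass defines it.
  def dispatch(handler)
    send(handler) if respond_to?(handler, true)
  end
end

# Example subclass: collects the text found inside <title>...</title>.
class TitleGrabber < TinyTagParser
  attr_reader :title

  def initialize
    @title = String.new
    @in_title = false
  end

  def start_title
    @in_title = true
  end

  def end_title
    @in_title = false
  end

  def handle_data(data)
    @title << data if @in_title
  end
end

grabber = TitleGrabber.new
grabber.feed("<html><head><title>Hello</title></head><body><p>text</p></body></html>")
puts grabber.title  # prints "Hello"
```

Tags without a matching handler are silently skipped here, which keeps the sketch short. The sgmllib documentation linked above describes how the Python original reports tags it does not recognise.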
ex.)

How to install

or

Author

Takahiro Maebashi <maebashi@iij.ad.jp>

Packager

Katsuyuki Komatsu <komatsu@sarion.co.jp>

History