Change history of txt2html


  • bugfix: reserved characters in titles created with --titlefirst are now escaped properly.
  • bugfix: when preformatting entire document, each line was getting its own <PRE></PRE> container (introduced with explicit preformatting feature in 1.26).
  • dict: added some characters to those allowed in http urls (=&;,).
  • dict: added "-" to allowed characters within *emphasized-pattern*.


  • Changed names of default link dictionaries to txt2html.dict

1.26 (not released)

  • Added -8 (for 8-bit-clean) to disable conversion of non-ASCII characters to their corresponding Latin-1 character entities.
  • Added -pm to allow explicit marking of preformatted text in source
  • Changed => to , in mapping, to stay compatible with Perl 4
  • Added debug flag 4, for observing link rules in action
  • Fixed length checking bug in header underline analysis
  • Changed a regexp so Perl 5.6 doesn't complain.
  • No longer add space after <LI> tags
  • Allow unindented lists to start after CAPS lines
  • Use · as a bullet character
  • Fixed bug that dropped a character when certain actions were taken on the last line of input that didn't end with a newline.
  • Added more aggressive regexps for _underlined_ and *emphasized* text.
  • Improved character markup rules
  • Added link rule for news URLs. (This must have been accidentally deleted at some point.)
  • Added link rule for common explicit url markup: <URL:foo>


  • Changed the official home page to (the old page will have a working redirect indefinitely.)
  • Added a LICENSE to the distribution. (modified BSD-style)
  • When no title is specified, an empty title element is inserted. (The old behavior was to omit the title element, which is forbidden by the spec.)
  • Made heading anchors appear inside the heading, rather than surrounding it (which is forbidden by the HTML spec)
  • Changed the DTD name
  • Added the --linkonly option so people can use the links dictionary feature without doing any other markup. This is useful for adding links to HTML fragments or documents.
  • Added the --prepend_body option for prepending HTML to the body.
  • Made in_link_context smarter so it won't link on attributes or tag names. (This is good for adding hyperlinks, but may screw up some clever uses of the linking code.)
  • Added link rules for _underlined text_ and *emphasized text*
  • Added --noescapechars to suppress converting "&" "<" and ">" into "&amp;" "&lt;" and "&gt;"
  • Changed pattern rules to handle non-ascii letters properly in matching patterns.
  • Added conversion of non-ascii letters into character entities.
  • Lots of upgrades to the links dictionary patterns


  • Changed behavior of custom headers to something much more useful: Header levels are assigned by regex in order seen. When a line matches a custom header regex, it is tagged as a header. If it's the first time that particular regex has matched, the next available header level is associated with it and applied to the line. Any later matches of that regex will use the same header level.
  • Added the -EH / --explicit-headings option
  • Added some unnecessary initialization to avoid warnings when perl is run with the -w switch.
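
The custom header rule above (levels handed out in the order each regex is first seen to match) can be sketched roughly as follows. This is only an illustration in Python, not the script's actual Perl code; the function name assign_header_levels and the sample patterns are made up for the example.

```python
import re

def assign_header_levels(lines, patterns):
    """Tag lines that match a custom header regex with a header level.

    Hypothetical helper: the first regex to match anywhere in the input
    gets level 1, the next new regex to match gets level 2, and so on.
    Later matches of a regex reuse the level it was given the first time.
    """
    level_of = {}      # pattern index -> header level assigned to it
    next_level = 1     # next unused header level
    tagged = []
    for line in lines:
        for i, pat in enumerate(patterns):
            if re.search(pat, line):
                if i not in level_of:
                    level_of[i] = next_level
                    next_level += 1
                tagged.append((level_of[i], line))
                break
        else:
            # No custom header regex matched: not a header.
            tagged.append((None, line))
    return tagged

# Made-up patterns and input, for illustration only.
patterns = [r"^\d+\.\d+ ", r"^Chapter "]
lines = ["Chapter One", "1.1 Intro", "text", "Chapter Two"]
result = assign_header_levels(lines, patterns)
```

Here "Chapter One" matches first, so that regex gets level 1 even though it is listed second; "1.1 Intro" gets level 2, and "Chapter Two" reuses level 1.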


  • Added handling for numbered lists that are lined up on the non-numeric character after the number rather than on the amount of whitespace preceding the number. (The numbers grow to the left instead of the right.)


  • Fixed bug in unhyphenation
  • Changed HTML version in default doctype line to 3.2


  • Added <META NAME="generator" CONTENT="txt2html v1.21">


  • Added DOCTYPE tag and --doctype options.
  • Syntax change to get rid of Perl 5 warning
  • Added ability to use the first line of the text as the title
  • Fixed some (unused) grossness in links dict file


  • Added --append_head
  • Mail and News name anchor surrounds just the first word ("Newsgroups:" or "From"), and not the whole line. That way, newsgroup names and email addresses get HREF'd as normal.


  • Cleaned up nested list handling & fixed a bug under Perl 5.
  • Changed a couple minor things to get rid of some of the Perl 5 warnings.


  • Lists can start even when not indented and not preceded by a blank line if the previous line was short or a header.
  • New flag "o" added for dictionary entries. Specifies that the link should only be done the first time a match is found.
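
The effect of the "o" flag (link only the first time a match is found) can be pictured with a small sketch. This is a Python illustration under assumptions, not the dictionary code itself: the class name OnceRule, the example pattern, and the example.org URL are all hypothetical.

```python
import re

class OnceRule:
    """A link rule that fires only on the first match in the whole input."""

    def __init__(self, pattern, url):
        self.pattern = re.compile(pattern)
        self.url = url
        self.used = False   # set once the rule has produced a link

    def apply(self, line):
        if self.used:
            return line     # already linked once; leave later matches alone

        def repl(match):
            return '<A HREF="%s">%s</A>' % (self.url, match.group(0))

        # count=1 links only the first match within this line.
        new_line, n = self.pattern.subn(repl, line, count=1)
        if n:
            self.used = True
        return new_line

# Hypothetical rule and input lines.
rule = OnceRule(r"txt2html", "http://example.org/")
out = [rule.apply(l) for l in ["no match", "txt2html and txt2html", "txt2html again"]]
```

Only the first "txt2html" on the second line is wrapped in an anchor; every later occurrence is left as plain text.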


  • Added anchoring of custom headers
  • Took the changelog out of the script
  • Tweaked $line_indent in sub liststuff
  • Insert <P> before each mail/news message


  • Fixed options handling for -e/+e , -r
  • Added "Newsgroups:" to trigger mail headers
  • Fixed anchor naming
  • Took out -T option, since it isn't implemented yet. Whoops.
  • Fixed bug in endpreformat


  • Fixed +l/--nolink option handling
  • Fixed major bug in dynamic_make_dictionary_links that allowed nested links under some circumstances.


  • Fixed usage message so it matches options. (whoops)
  • Added custom heading style feature


  • Fixed bug in heading regexp
  • Changed underline tolerance parameters from min & max length difference to length difference & offset difference
  • Centralized line reading, added handling of DOS carriage returns
  • Switched to heading style stack. Styles still very limited.
  • Changed heading anchor names from a simple count to a hierarchical section number.
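
The difference between a simple count and a hierarchical section number for anchor names can be sketched like this. It is a Python illustration only; the function name and the "section_" prefix are assumptions made for the example, not taken from the script.

```python
def section_anchor_names(levels):
    """Turn a sequence of heading levels (1 = top level) into
    hierarchical anchor names such as 'section_1_2'.

    Hypothetical helper: one counter is kept per heading level; a heading
    bumps its own counter and discards the counters of deeper levels.
    """
    counters = []
    names = []
    for level in levels:
        del counters[level:]            # forget deeper levels
        while len(counters) < level:    # pad if a level was skipped
            counters.append(0)
        counters[level - 1] += 1
        names.append("section_" + "_".join(str(c) for c in counters))
    return names

names = section_anchor_names([1, 2, 2, 1, 2])
```

With a simple count these five headings would be numbered 1 to 5; with the hierarchical scheme they come out as section_1, section_1_1, section_1_2, section_2 and section_2_1.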


  • Blank lines are never considered underlined
  • Shortline breaking slightly more intelligent (or at least different)
  • Paragraph breaks much more intelligent
  • Lowercased tags. Style is so fickle.
  • Added links dictionaries, link making, etc.
  • Allow repeated bullet chars for unordered lists. (Tiny mod to regexp)
  • switched order of caps & liststuff in main()
  • improved untabify() so it converts the whole line, not just beginning
  • split up all lines >79 characters to avoid common downloading error (people would sometimes copy the script off the display, inadvertently adding a few newlines in bad places in the code)
  • Handles option "--" now.
  • Accepts named files as input as alternative to stdin
  • Deals with stdin properly (no more extra EOFs needed)
  • Improved mail handling


  • Added --extract, etc.


  • Changed from #!/usr/local/bin/perl to the more clever version in the man page. (How did I manage not to read this for so long?)
  • Swapped hrule & header back to handle double lines. Why should this order screw up headers?


  • put mail_anchor back in. (Why did I take this out?)
  • Finally added handling of lettered lists (ordered lists marked with letters)
  • Added title option (--title, -t)
  • Shortline now looks at how long the line was before txt2html started adding tags. ($line_length)
  • Changed list references to scalars where appropriate. (@foo[0] -> $foo[0])
  • Added untabify() to homogenize leading indentation for list prefixes and functions that use line length
  • Added "underline tolerance" for when underlines are not exactly the same length as what they underline.
  • Added error message for unrecognized options
  • removed \w matching on --capstag
  • Tagline now removes leading & trailing whitespace before tagging
  • swapped order of caps & heading in main loop
  • Cleaned up code for speed and to get rid of warnings
  • Added more restrictions to something being a mail header
  • Added indentation for lists, just to make the output more readable.
  • Fixed major bug in lists: $OL and $UL were never set, so when a list was ended "</UL>" was *always* used!
  • swapped order of hrule & header to properly handle long underlines


  • Added to comments in options section
  • renamed blank to is_blank
  • Page break is converted to horizontal rule <HR>
  • moved usage subroutine up top so people who look through code see it sooner


  • Creates anchors at each heading


  • Fixed minor bug in Headers
  • Preformatting can be set to only start/stop when TWO lines of [non]formatted-looking-text are encountered. Old behavior is still possible through command line options (-pb 1 -pe 1).
  • Can preformat entire document (-pb 0) or disable preformatting completely (-pe 0).
  • Fixed minor bug in CAPS handling (paragraph breaks broke)
  • Puts paragraph tags *before* paragraphs, not just between them.


  • Allow ':' for numbered lists (e.g. "1: Figs")
  • Whitespace at end of line will not start or end preformatting
  • Mailmode is now off by default
  • Doesn't break short lines if they are the first line in a list item. It *should* break them anyway if the next line is a continuation of the list item, but I haven't dealt with this yet.
  • Added action on lines that are all capital letters. You can change how these lines get tagged, as well as the minimum number of consecutive capital letters required to fire off this action.


  • Tiny bugfix in unhyphenation


  • Added unhyphenation

Last modified: Tue May 23 13:14:41 BST 2000