htmlpp Version 4.2a
Htmlpp is a preprocessor for HTML files, and is intended to simplify the task of maintaining large sets of HTML documents. You provide htmlpp with a document that is a mix of HTML-tagged text and htmlpp commands. Htmlpp generates a set of HTML files from that document.
To run htmlpp, use the following syntax:
htmlpp [-option...] filename ...
Where filename is assumed to have an extension '.htp' if necessary. You can use these command-line options:
htmlpp -page 1 xxxx
You can generate a series of specific pages too:
htmlpp -page 1,9-10,15 xxxx
You can also refer to page file names, instead of page numbers (but not using the 'nn-nn' syntax):
htmlpp -page justhis.htm,andthat.htm xxxx
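To make the accepted forms of the -page argument concrete, here is a minimal Python sketch of how such a value could be split into page numbers and page file names. It is not htmlpp's own code (htmlpp is written in Perl); the function name parse_page_spec is hypothetical, and treating every non-numeric item as a file name is an assumption.

```python
def parse_page_spec(spec):
    """Split a -page argument into page numbers and page file names."""
    numbers, names = [], []
    for item in spec.split(","):
        item = item.strip()
        if item.isdigit():
            # A single page number, e.g. "15".
            numbers.append(int(item))
        elif "-" in item and all(p.isdigit() for p in item.split("-", 1)):
            # A range in the 'nn-nn' syntax, e.g. "9-10".
            first, last = (int(p) for p in item.split("-", 1))
            numbers.extend(range(first, last + 1))
        else:
            # Assumption: anything else is a page file name.
            names.append(item)
    return numbers, names
```

With this sketch, "1,9-10,15" selects pages 1, 9, 10, and 15, while "justhis.htm,andthat.htm" selects two pages by name.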
-set name=value
For example,
htmlpp -set BASE=temp myfile.txt
Under DOS, you must enclose name=value in quotes, e.g.:
htmlpp -set "BASE=temp" myfile.txt
All command-line options can be shortened to their significant letters, e.g. '-d' is the same as '-debug'.
Htmlpp replaces symbols in command lines and HTML text. You can specify a symbol in various ways:
<A HREF="$(name)">name</A>
<A HREF="$(name)" attributes>name</A>
<A HREF="$(name)">label</A>
<A HREF="$(name)" attributes>label</A>
<A HREF="$(name)">$(name)</A>
<A HREF="$(name)" attributes>$(name)</A>
For each of these hyperlink forms: if the symbol is not defined and the multilingual symbols option is turned on, htmlpp will also search for the symbol name.xx where xx is the current language code. Also, if name already includes a language code (e.g. $(*home.es)) that is different from the current language and the multilingual symbols option is turned on, the HTML 4.0 "hreflang=xx" attribute is automatically added in the <A...> tag.
You can define symbols in terms of symbols: $($(name)) is quite okay, if you know what you are doing. Htmlpp inserts symbols from right to left in the line.
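The right-to-left rule is what makes $($(name)) work: the innermost reference is also the rightmost '$(', so it is replaced first. Here is a small Python sketch of that idea. It only handles the plain $(name) form, the function name expand is hypothetical, and expanding undefined symbols to an empty string is an assumption, not something this page states.

```python
def expand(line, symbols, max_steps=100):
    """Expand $(name) references, always taking the rightmost '$(' first."""
    for _ in range(max_steps):  # bound the loop in case a symbol refers to itself
        start = line.rfind("$(")
        if start < 0:
            break
        end = line.find(")", start)
        if end < 0:
            break  # unterminated reference: leave the text alone
        name = line[start + 2:end]
        # Assumption: an undefined symbol expands to nothing.
        line = line[:start] + symbols.get(name, "") + line[end + 1:]
    return line
```

Given name = TITLE and TITLE = My Document, the line <H1>$($(name))</H1> first becomes <H1>$(TITLE)</H1> and then <H1>My Document</H1>.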
Symbols are of various types
Htmlpp provides these standard symbols for use at any point in the document:
1 1.1 1.2 1.2.1 1.2.2 ... etc.
Htmlpp automatically manages the numbering of header levels. You are, however, limited to the 'dotted number' syntax.
In addition, htmlpp will include the current environment symbols if you run it with the -env option. You can use this (although I don't see the utility immediately) to redefine any of the standard symbols such as $(EXT). Remember that you can also access any of the environment symbols using the %(...) syntax; e.g. %(PATH).
A htmlpp command starts with a dot, in column 1, followed by a keyword. You can put spaces between the dot and the keyword. To continue the command line over the next line, end the line with a hyphen (though you need to put at least the dot and the keyword on the same line). Commands can be in upper- or lower-case: .endblock and .EndBlock are equivalent.
These are the commands that htmlpp understands:
.define count = 1
.echo $(count)
.define count = $(count) + 1
.echo $(count)
Of course it helps to know that htmlpp will evaluate all variables before passing the expression to Perl to work out. So, the second .define is evaluated as '1 + 1'. If you decide to rely on Perl (a good bet for now), you can use the .define = command to execute shell commands, e.g.:
.if $(PASS)
. define junk = system "rm *.htm";
.endif
If you append .xx to the symbol name, where xx is a two-character language code, you can afterwards use the variable as $(symbol) without writing the language code, provided $(USE_LANG)=1 and $(LANG)=xx. You should use standardized ISO-639 language codes.
.define INC++ ""
Note that the empty string is treated as zero; the next time the symbol will be '1'. You can also use '--' after the symbol name to subtract one from its value each time it is used. You can stick the '++' or '--' before the symbol name: then the symbol is incremented or decremented before its value is taken.
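The difference between the suffix and prefix forms is the usual post- versus pre-increment distinction. Here is a rough Python model of it. The class name CountingSymbol is hypothetical, and returning the stored value unchanged on the first suffix-style use is an assumption drawn from the wording above.

```python
class CountingSymbol:
    """Models a symbol defined with ++ or -- before or after its name."""

    def __init__(self, value="", step=1, prefix=False):
        self.value = value      # stored as text, like any other symbol
        self.step = step        # +1 for '++', -1 for '--'
        self.prefix = prefix    # True when '++' or '--' precedes the name

    def use(self):
        number = int(self.value or 0)   # the empty string counts as zero
        if self.prefix:
            # Prefix form: change the value first, then hand it out.
            self.value = str(number + self.step)
            return self.value
        # Suffix form: hand out the current value, then change it.
        current = self.value
        self.value = str(number + self.step)
        return current
```

So a symbol modelled as CountingSymbol("") yields the empty string on first use and '1' the next time, while CountingSymbol("", prefix=True) yields '1' straight away.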
<H1>$(TITLE)</H1>
.if -f myfile.htm
An .if block must be entirely in one line.
.if $(number) == 0
.if $(number) != 1
.if $(number) > 2
.if $(string) eq "value"
.if $(string) ne "value"
.define LOCAL i:/site:
.define SERVER http://www.imatix.com
The directory must be relative to either of these two. It should start with '/' but not end with '/'. You can specify zero or more filenames or wildcards (htmlpp accepts * and ?, according to UNIX rules). If you specify no filespecs, htmlpp assumes you mean '*'. The filespecs can include Perl regular expressions: place the filespec between double quotes, e.g. to match all files with 'doc' or 'txt' somewhere in the name: .build dir /pub "doc|txt". An example might help:
.define .txt Text file
.define .htm HTML document
.define .zip ZIP archive
.block dir_open
<PRE>
.block dir_entry
$(*DIR_HREF="$(DIR_NAME)") $(DIR_SIZE) $($(DIR_EXT))
.block dir_close
</PRE>
.endblock
Note the sneaky double-dereferencing of $(DIR_EXT) which translates the file extension into a comment like 'Text file'. I usually stick all such .defines in a separate .include file, filetype.def.
Each query criterion is defined by four pipe-delimited fields. You can use as many criteria as you want. The fields are the
value OPERATOR database_field_value
date, number, or string
on
.for row in %database "\\|" "COMMENT:" "on" "on" "$(person)|1|=|string"
.page "$(5)" = "$(2) $(3)'s Homepage"
<UL>
<LI>E-mail: <A href="mailto:$(4)">$(4)</A>
<LI>Office: $(5)
[...]
.endfor
Then calling 'htmlpp -set person=personcode template_file' for each person will produce that person's page.
Macros are a shorthand way to produce HTML tags and other constructs. This is how I define a macro 'H3':
.macro H3 <H3>$*</H3>
I use all uppercase names for macros, but this is just a convention, since the case is not important. We can use a macro like H3 in three ways:
.H3 some text
or
<!--.H3 some text-->
or
<.H3 some text>
The first form is good for titles and other constructs that come naturally on a line by themselves. Since it uses a syntax similar to htmlpp commands, there is a certain danger that a macro will conflict with some future command. This is just too bad; the alternative of inventing yet another syntax for macros was (for me) a worse choice. In any case, htmlpp will warn you if you try to define a macro that already exists as a command. The second form is compatible with HTML editors and some other HTML preprocessors, but is frankly a pain to type. The third form is good for mark-up tags. The second and third forms suffer from one problem: the whole thing has to come on a single line.
When you use a macro like this: <.H3 some text> you are supplying arguments. Here we supply two, 'some' and 'text'. You can refer to these as $1 and $2 inside the macro definition, or together as $*. Htmlpp can handle quotes correctly, so <.H3 "some text"> only supplies one argument, $1.
The $+ symbol expands to anything left over after $1, $2, etc. For instance, if you refer to $1 and $3 in the macro body, $+ refers to $4 and any remaining arguments.
The $# symbol expands to the number of macro arguments.
You can define a macro with a section that repeats for each argument. This is useful if you don't know in advance how many arguments you are going to have. For instance, the standard .THEAD macro generates a table heading for one, two, three, or more columns. You specify the repeating section as {...$n...}. The text between '{' and '}' is repeated for each argument; "$n" (dollar sign, small 'n') is replaced by the argument value.
To use multi-word arguments, enclose them in quotes.
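The argument symbols are easier to follow with a worked model. The Python sketch below imitates $1...$9, $*, $+, $#, and the {...$n...} repeating section. It is only an approximation of the rules described above: the function name expand_macro is hypothetical, quote handling is borrowed from Python's shlex module (which will choke on a stray apostrophe), and a missing numbered argument is assumed to expand to nothing.

```python
import re
import shlex


def expand_macro(body, text):
    """Expand a macro body using the arguments found in text."""
    args = shlex.split(text)  # a quoted phrase stays together as one argument

    # Repeat each {...$n...} section once per argument.
    def repeat(match):
        return "".join(match.group(1).replace("$n", arg) for arg in args)
    body = re.sub(r"\{(.*?)\}", repeat, body)

    # $+ is whatever follows the highest-numbered argument used in the body.
    numbered = [int(n) for n in re.findall(r"\$(\d)", body)]
    rest = args[max(numbered):] if numbered else args

    body = body.replace("$*", " ".join(args))
    body = body.replace("$+", " ".join(rest))
    body = body.replace("$#", str(len(args)))

    # Numbered arguments; assumption: missing ones expand to nothing.
    def numbered_arg(match):
        index = int(match.group(1))
        return args[index - 1] if 0 < index <= len(args) else ""
    return re.sub(r"\$(\d)", numbered_arg, body)
```

For instance, expanding the body <TR>{<TH>$n</TH>}</TR> (an invented stand-in, not the real .THEAD definition) with the arguments Name "Home page" gives one <TH> cell per argument.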
When a macro refers to a variable using $(xxx), this will be expanded as soon as the macro is expanded. Usually this is what you expect, but sometimes you need the variable to be expanded in the next pass, for instance if you generate the .define in the same pass. In this case, escape the variable: $\(xxx).
The file macro.def that comes with htmlpp defines a set of standard macros. You can define multiline macros that include other commands, like .if and .include.
Supported character sets are ISO-8859-1 and MS-DOS (codepage 850). In general you can use ISO-8859-1 both for Unix Latin-1 and Windows 1252. You can define the character set through the -charset command-line option or let htmlpp do a little testing of the wind to figure out whether it's running under a Unix or a DOS system (Windows testing not supported). If you use htmlpp on a Mac, or on documents encoded using another character set, it won't work. Basically htmlpp handles MS-DOS accents if there is an environment variable 'COMPSPEC' defined, and ISO-8859-1 accents if there is a file called "/etc/passwd" on the system.
If you use any character which is not in ISO-8859-1 or MS-DOS CP850 you will find that it comes out as '?' (not found). If you have the HTML metachar for the character (which must be a Unicode numeric reference rather than an entity reference if it is outside ISO-8859-1) and the octal ASCII code for the character set you are using, please send it to me.
.define USE_LANG 1
.define home.en "http://www.myserver.com/english/index.html"
.define home.es "http://www.myserver.com/spanish/index.html"
.define home.fr "http://www.myserver.com/french/index.html"
Now if you use the symbol $(home), htmlpp will
If you want to make a link to a variable which is in a different language than the current one in $(LANG) you can use the full symbol name and Htmlpp will add for you the 'hreflang=xx' attribute:
.define $(LANG) es
$(*home=Home) --> <A href="http://www.myserver.com/spanish/index.html">Home</A>
$(*home.es=Home) --> <A href="http://www.myserver.com/spanish/index.html">Home</A>
$(*home.en=Home) --> <A href="http://www.myserver.com/english/index.html" hreflang=en>Home</A>
You can use the following trick. Write your source file as:
.if ("$(LANG)" eq "en")
[... version in English ...]
.elsif ("$(LANG)" eq "es")
[... version in Spanish ...]
.elsif ("$(LANG)" eq "fr")
[... version in French ...]
.endif
Then to process each language invoke htmlpp as 'htmlpp -set LANG=xx filename'. You can also write a simple shell script to process the three languages at once by calling htmlpp three times.
Recognising that a True Guru does not have time to painfully mark-up large HTML documents, htmlpp includes a basic text-to-HTML converter. You can invoke this as a preprocessing phase to the normal htmlpp process. Right now, this is an either-or choice; you either use htmlpp commands in a HTML document, or a text document and guru mode, but not a mixture of the two modes. (Release 3.1 of htmlpp tried to make this work, but that did not last long :\ )
You can, usefully, use htmlpp's guru mode to mark-up a document, then fine-tune it by hand.
To use guru mode, run htmlpp with the '-guru' option:
htmlpp -guru filename
Guru mode works by recognising layout, and converting this to HTML. I've tried to keep a balance between features and complexity, to give you something useful without becoming too formal (which is what HTML is for). Basically, guru mode relies on layout rules that also help to make the text readable in any case. For example, blank lines and indentation are significant in most places. One consequence of this is that the plain text file is very readable even before it is HTML'd (assuming you do your bit to help things.)
In guru mode, htmlpp reads an input text file (with any name and extension except '.hpp') and creates an output file with the same name and the extension '.hpp'. It then processes this file as it would any normal input file. The '.hpp' file remains afterwards, so you can use it as the basis for further refinement if wanted. (You should call it something else, to avoid embarrassing mistakes.)
The file 'guru.def' is always inserted at the start of the newly-created file. You can modify this file as wanted, to tune the results of guru mode. You cannot choose another name for this file other than by changing htmlpp's source code, which I don't recommend.
Htmlpp looks for a file called 'guru.fmt' which may exist and which may redefine the various HTML tags it uses. A file 'guru_opt.fmt' is supplied in the htmlpp distribution; rename or copy this to 'guru.fmt' and change any values you want to (I'd suggest you remove anything that does not change, just to make things clear). I've made it work in this way so that if you reinstall htmlpp, you don't lose your work.
Htmlpp handles three levels of headers, H1, H2, and H3. In the text these look like this:
Chapter Header
**************

Section Header
==============

Subsection Header
-----------------
The line following the header text must start with 3 or more asterisks, equals, or hyphens. There is no way to specify H4 or other headers. I recommend that you start the document with a chapter header.
You can also request a horizontal rule (<HR>) by putting four or more dots on a line by themselves:
....
The header text line must come after a blank line, or at the start of the document.
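The header rules boil down to a small test on pairs of lines. Here is a Python sketch of that test, under the rules stated above: a non-blank text line that follows a blank line (or starts the document) and is underlined by three or more asterisks, equals signs, or hyphens. The function name find_headers is hypothetical; the real guru-mode code may well differ in its details.

```python
import re

# Underline character -> header level, as described in the text.
LEVELS = {"*": "H1", "=": "H2", "-": "H3"}


def find_headers(text):
    """Return (level, title) pairs for every header found in text."""
    lines = text.splitlines()
    found = []
    for i in range(len(lines) - 1):
        # The header text must start the document or follow a blank line.
        after_blank = i == 0 or not lines[i - 1].strip()
        # The next line must start with 3 or more of the same marker.
        underline = re.match(r"(\*{3,}|={3,}|-{3,})", lines[i + 1])
        if after_blank and lines[i].strip() and underline:
            found.append((LEVELS[underline.group(1)[0]], lines[i].strip()))
    return found
```

A line that is underlined but does not follow a blank line is left alone, which is why the blank-line rule matters.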
If your document contains at least two chapters, htmlpp will insert a table of contents before the second chapter header. This works best if the first chapter is empty or contains a brief text to introduce the document. Htmlpp inserts the table of contents by adding a section header called 'Table of Contents', and then a line '.include contents.def', in the normal manner. You should not call the first chapter 'Table of Contents'.
Htmlpp inserts a '.page' command before each chapter header. Therefore, use chapter headers wisely to break the document into usable pages.
The guru.def file normally includes 'prelude.def', which defines page headers and footers for the document. You will normally tune these for any project -- the supplied files contain references to iMatix URLs that may not be appropriate for your work. I like to use the same headers and footers (the same prelude.def) for all the files in a project, including those that use guru mode.
A paragraph is anything following a blank line that does not look like something else. Basically, any plain text following a blank line is given a <P> tag. Note however the exceptions that follow...
If a line is indented by 4 or more spaces, or a tab, htmlpp treats the line as 'preformatted' text and inserts a <PRE> tag. You can mix blank lines with preformatted text.
A paragraph starting with a hyphen and a space is considered to be a bulleted list item. A paragraph starting with a digit and a dot and optionally a space is considered to be a numbered list item. You can put blank lines between list items, but it's not necessary. Cosmetically, when list items are short, blank lines are disturbing. But when list items are several lines, blank lines make the text more readable. Either way, htmlpp is happy.
A definition list is a line ending in ':' followed by some lines indented by one or more spaces. For example:
Definition:
  Explanation of definition.
You can put blank lines between definition items, but again, it's a matter of cosmetics. There should be a blank line before the first definition item, however.
Tables are one of the real pains of HTML markup, in my opinion. Here htmlpp tries to solve the most common case; a two-column table consisting of a term or value in one column, and an explanation in the second column.
A table can start with a header, which is a line like this:
Some column: Followed by some explanation:
Here, the colons (':') are important. Htmlpp also wants a capital letter at the start of both phrases, and a space after the first colon. The table header is optional; you can start immediately with table items. Either way, htmlpp needs a blank line before the table. A table item looks like this:
Some_word: Followed by some explanation which can come on several lines.
The first column must be a single word - if you want several words, use underlines. Htmlpp replaces these by spaces. The explanation can come on several lines, which must be indented by one or more spaces.
To insert a figure, use one of these conventions:
[Figure filename: caption]
[Figure "filename": caption]
Htmlpp inserts a figure caption, numbering the figures in a document from 1 upwards. The caption is followed by an <IMG> tag to display the file. You can use a URI (a path) as the filename, or a URL (with a host name specifier); you must put a URL in quotes. My preference is to put image files locally with the HTML files, and use a simple filename without a path. This is just easier to manage and lets you put the HTML files plus images in any directory. If htmlpp can find the image you specify, and it's a .GIF or .JPG file, it will insert the WIDTH= and HEIGHT= attributes automatically.
To insert a plain image, omit the 'Figure' keyword. For example, these are all examples of valid images:
[Figure somefile.gif: caption]
[somefile.gif: caption]
[Figure somefile.gif]
[somefile.gif]
If you use <name@address>, this is converted into a mailto: URL hyperlink. If you use <http://address/document> -- or any other URL -- this is converted into a hyperlink as well. You can follow the URL by ':description' if you like, e.g. <http://www.imatix.com:iMatix Corporation's Site>. You can also refer to local files using the syntax </localfile[:description]>.
Htmlpp does not presently allow links within the document or to other documents.
Since you're not typing HTML, htmlpp replaces <, > and & by HTML metacharacters. < and > are used to indicate hyperlinks.
Htmlpp provides a number of intrinsic functions that you can use in your text. The syntax for using an intrinsic function is:
&function-name(arguments)
This function: | Does this: |
---|---|
&date("picture", date) | Format specified date using picture |
&date("picture", date, lc) | Format specified date using picture and language code. |
&date("picture") | Format current date using picture |
&date() | Return current date value |
&time() | Format current time as hh:mm:ss |
&week_day([date]) | Get day of week, 0=Sunday, 6=Saturday |
&year_week([date]) | Get week of year, 1 is first full week |
&julian_date([date]) | Get Julian date for date |
&lillian_date([date]) | Get Lillian date for date |
&date_to_days(date) | Convert yyyymmdd to Lillian date |
&days_to_date(days) | Convert Lillian date to yyyymmdd |
&future_date(days,[date]) | Calculate future date |
&past_date(days,[date]) | Calculate past date |
&date_diff(date1,date2) | Calculate differences between dates |
&image_height("image.ext") | Get image height (GIF, JPEG) |
&image_width("image.ext") | Get image width (GIF, JPEG) |
&file_size("filename",arg) | Get size of file: optional arg K or M |
&file_date("filename") | Get date of file |
&file_time("filename") | Get time of file as hh:mm:ss |
&normalise("filename") | Normalise filename to UNIX format |
&system("command") | Get result of some system utility |
&upper("string") | Convert string to uppercase text |
&lower("string") | Convert string to lowercase text |
&pageref("page","title") | Build link for page index |
&relpath("to") | Get relative path from current document->to |
&relpath(["from"],"to") | Get relative path from->to |
Syntax:
&date(picture, value)
&date(picture, value, language)
&date(picture)
&date()
Without a picture, returns the current date. With a picture, formats the current date according to a picture that you specify. You can optionally supply a date value in the standard 8-digit format; YYYYMMDD (as returned by &date()), or use 0 to indicate today's date. You can optionally follow the picture and value by a language code; the values currently accepted are "es" for Spanish, "fr" for French, and "dk" for Danish. Anything else is taken to mean English. If no language is specified, $(LANG) is used by default. The picture can consist of any mixture of these elements:
cc | century 2 digits, 01-99 |
y | day of year, 1-366 |
yy | year 2 digits, 00-99 |
yyyy | year 4 digits, 100-9999 |
m | month, 1-12 |
mm | month, 01-12 |
mmm | month, 3 letters |
mmmm | month, full name |
MMM | month, 3 letters, ucase |
MMMM | month, full name, ucase |
d | day, 1-31 |
dd | day, 01-31 |
ddd | day of week, Sun-Sat |
dddd | day of week, Sunday-Saturday |
DDD | day of week, SUN-SAT |
DDDD | day of week, SUNDAY-SATURDAY |
w | day of week, 1-7 (1=Sunday) |
ww | week of year, 1-53 |
q | year quarter, 1-4 |
\x | literal character x |
other | literal character |
Examples:
.echo &date() --> Nov 13, 99
.echo &date('mmm d, yy') --> Dec 2, 98
.echo &date('d mmm, yy') --> 2 Dec, 98
.echo &date("yymd") --> 9812 2
.echo &date("yyyymmdd") --> 19981202
.echo &date("d \de mmmm \de yyyy", 0, "es") --> today's date in Spanish
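To show how a picture is read, here is a Python sketch that formats a YYYYMMDD value using a subset of the picture elements (yyyy, yy, mmmm, mmm, mm, m, dddd, ddd, dd, d, q, and \x). It is English-only and is not htmlpp's implementation; the function name format_date is hypothetical. The key point is that longer elements must be tried before shorter ones, so that 'mmm' is not read as 'mm' followed by 'm'.

```python
import re
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
        "Saturday"]

# Longest alternatives first, so 'mmmm' wins over 'mmm', 'mm', and 'm'.
TOKEN = re.compile(r"\\(.)|yyyy|yy|mmmm|mmm|mm|m|dddd|ddd|dd|d|q")


def format_date(picture, value):
    """Format an 8-digit YYYYMMDD string according to picture."""
    d = date(int(value[:4]), int(value[4:6]), int(value[6:8]))
    month = MONTHS[d.month - 1]
    weekday = DAYS[(d.weekday() + 1) % 7]   # Python counts Monday as 0
    table = {
        "yyyy": "%04d" % d.year, "yy": "%02d" % (d.year % 100),
        "mmmm": month, "mmm": month[:3],
        "mm": "%02d" % d.month, "m": str(d.month),
        "dddd": weekday, "ddd": weekday[:3],
        "dd": "%02d" % d.day, "d": str(d.day),
        "q": str((d.month - 1) // 3 + 1),
    }

    def replace(match):
        if match.group(1) is not None:
            return match.group(1)   # \x stands for the literal character x
        return table[match.group(0)]

    # Anything that is not a picture element passes through unchanged.
    return TOKEN.sub(replace, picture)
```

This also shows why the Spanish example escapes the 'd' in 'de': unescaped, it would be taken as the day-of-month element.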
Syntax:
&time()
Formats the current time in the same way as the $(TIME) symbol. The difference is that $(TIME) is set when htmlpp starts working; &time() reflects the current time.
Syntax:
&week_day()
&week_day(date)
Returns the day of the week for the specified date, or for the current date if no argument is given. Day 0 is Sunday; day 6 is Saturday.
Syntax:
&year_week()
&year_week(date)
Returns the week of the year for the specified date, or for the current date if no argument is given. Week 1 is the first full week, starting with a Sunday.
Syntax:
&julian_date()
&julian_date(date)
Returns the Julian date for the specified date, or for the current date if no argument is given. Day 1 is January 1.
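For comparison, the three calendar functions above can be approximated in a few lines of Python. This is a sketch of the documented conventions, not htmlpp's code, and the function names simply mirror the htmlpp ones. Using strftime's %U for the week number is my assumption: it counts weeks from the first Sunday and reports the days before it as week 0, which this page does not specify.

```python
from datetime import date


def parse(value):
    """Turn an 8-digit YYYYMMDD string into a date object."""
    return date(int(value[:4]), int(value[4:6]), int(value[6:8]))


def week_day(value):
    # Python counts Monday as 0; shift so that 0=Sunday ... 6=Saturday.
    return (parse(value).weekday() + 1) % 7


def year_week(value):
    # %U: week of year with Sunday as first day; week 1 starts on the
    # first Sunday (assumption: earlier days count as week 0).
    return int(parse(value).strftime("%U"))


def julian_date(value):
    # Day of the year, where day 1 is January 1.
    return parse(value).timetuple().tm_yday
```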
Syntax:
&lillian_date()
&lillian_date(date)
Returns the Lillian date for the specified date, or for the current date if no argument is given. This is the number of days since a starting (but unspecified) epoch (which in fact is around 1582).
Syntax:
&date_to_days(date)
Returns the Lillian date for the specified date. This function is really the same as &lillian_date() except that you must supply a date argument. It's provided for orthogonality with &days_to_date().
Syntax:
&days_to_date(days)
Converts a Lillian date back into a normal date in the form yyyymmdd. You can use this function (in combination with the reverse function, &date_to_days()) to calculate past and future dates.
Syntax:
&future_date(days)
&future_date(days,date)
Calculates a date at some point in the future. For instance, &future_date(1) will produce tomorrow's date, e.g. 19981109 when run on 19981108. If the date argument is not provided, calculates from today.
Syntax:
&past_date(days)
&past_date(days,date)
Calculates a date at some point in the past. For instance, &past_date(1) will produce yesterday's date, e.g. 19981107 when run on 19981108. If the date argument is not provided, calculates from today.
Syntax:
&date_diff(date)
&date_diff(date1,date2)
Calculates the difference between two dates, in days. The calculation is date1 - date2. If date2 is not supplied, calculates using today, and will therefore return a positive value if date is in the future, and a negative value if date is in the past.
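All of the day-count functions above reduce to converting a date to a running day number and back. Here is a Python sketch of that arithmetic. It assumes the usual Lilian-date epoch, in which day 1 is 15 October 1582; this page only says the epoch is 'around 1582', so htmlpp's actual numbers may be offset. Unlike the htmlpp functions, the date argument is always required here.

```python
from datetime import date

# Assumption: day 1 is 15 October 1582, the usual Lilian-date epoch.
EPOCH = date(1582, 10, 15).toordinal() - 1


def date_to_days(value):
    """Convert a YYYYMMDD string to a day count."""
    d = date(int(value[:4]), int(value[4:6]), int(value[6:8]))
    return d.toordinal() - EPOCH


def days_to_date(days):
    """Convert a day count back to a YYYYMMDD string."""
    return date.fromordinal(days + EPOCH).strftime("%Y%m%d")


def future_date(days, value):
    return days_to_date(date_to_days(value) + days)


def past_date(days, value):
    return days_to_date(date_to_days(value) - days)


def date_diff(date1, date2):
    # Positive when date1 is later than date2.
    return date_to_days(date1) - date_to_days(date2)
```

Whatever epoch is used cancels out in future_date, past_date, and date_diff, so only the raw day counts depend on the assumption.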
Syntax:
&image_width(filename)
Returns the width of the specified image, which can be a GIF or JPEG file, in any of the common formats (including progressive encoding). The width is returned in pixels.
Syntax:
&image_height(filename)
Returns the height of the specified image, which can be a GIF or JPEG file, in any of the common formats (including progressive encoding). The height is returned in pixels.
Syntax:
&file_size(filename)
&file_size(filename, K)
&file_size(filename, M)
Returns the size of the specified file. If the second argument is K or M, calculates the size in Kb or Mb as appropriate. Always returns an integer value.
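A rough Python equivalent is shown below. The text only promises an integer result, so two details are assumptions on my part: that K and M mean multiples of 1024, and that the result is rounded down.

```python
import os


def file_size(filename, unit=""):
    """Return the file size in bytes, or in whole K or M units."""
    size = os.path.getsize(filename)
    # Assumption: K = 1024 bytes and M = 1024 * 1024 bytes.
    divisor = {"K": 1024, "M": 1024 * 1024}.get(unit.upper(), 1)
    # Assumption: fractional units are dropped, not rounded up.
    return size // divisor
```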
Syntax:
&file_date(filename)
Returns the date of the specified file, as an 8-digit value, YYYYMMDD.
Syntax:
&file_time(filename)
Returns the time of the specified file, as a string, HH:MM:SS.
Syntax:
&normalise(filepath)
Returns the filepath in a UNIX-style format. You can use this, for instance, under MS-DOS, when filenames taken from (e.g.) the environment contain back slashes which can cause problems. Replaces \ by / and spaces by underlines.
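The two replacements are simple enough to state directly. Here they are as a one-line Python sketch of the rule described above (not htmlpp's own code):

```python
def normalise(filepath):
    """Replace backslashes by slashes and spaces by underlines."""
    return filepath.replace("\\", "/").replace(" ", "_")
```

For example, the DOS-style path C:\My Docs\index.htm becomes C:/My_Docs/index.htm.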
Syntax:
&system(string)
Returns the result of some system utility. For instance:
.define SERVER http://&system("hostname")
Syntax:
&upper(string)
Returns the string in uppercase letters.
Syntax:
&lower(string)
Returns the string in lowercase letters.
Syntax:
&pageref("page","title")
Does the same as this htmlpp code:
.if "name" eq "index3.htm"
title
.else
<A HREF="page">title</A>
.endif
This function strips off any HTML tags that you put around the title text when it uses it to build a link. So, you can do this kind of thing, which I often use to build an index in the page footer:
.block index_entry
&pageref("$(INDEX_PAGE)","<EM>$(INDEX_TITLE)</EM>")
.endblock
Syntax:
&relpath(["from"], "to")
Returns the relative path from 'from' to 'to'. If only one argument is specified, the current HTML page is used as 'from'. For example:
&relpath("john/peter/david/me.html","john/henry/you.html")
would return "../../henry/you.html". You can use this to create a web site with pure relative references to other pages. Remember that you don't need to use this function with $(*name) links if $(USE_RELPATH) is set to 1. Note that if you use relative references you can test and use the HTML pages on a local hard disk as well as on the server without changes.
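The same calculation can be reproduced with Python's standard posixpath module, which is handy for checking what &relpath should return. This is a sketch of the idea rather than htmlpp's implementation. The one thing to notice is that the relative path is worked out from the directory that holds the 'from' page, not from the page itself.

```python
import posixpath


def relpath(from_page, to_page):
    """Return the relative path leading from one page to another."""
    # Start from the directory containing the 'from' page.
    start = posixpath.dirname(from_page) or "."
    return posixpath.relpath(to_page, start)
```

Calling relpath("john/peter/david/me.html", "john/henry/you.html") gives "../../henry/you.html", matching the example above.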
Since version 2.00, htmlpp uses a multipass technique to allow embedded blocks. For example, you can place .include actions in the header or footer blocks, or define your own blocks that have .define, .page, and other actions.
Htmlpp handles this using the following rules:
One consequence of this is that htmlpp needs a minimum of 3 passes to fully process a document: one to collect all the titles, one to insert page headers and footers, and a last one to break the text into individual pages. If any genius can help me reduce this to two passes (or one!), go ahead.
The upside is that you can do really funky stuff in headers and footers: for instance, the htmlpp pages build a document index in the footer, switching hyperlinks on and off to indicate the current page in the index.
To see what htmlpp is doing with its passes, use the -debug option, like this:
htmlpp -debug filename
This leaves a number of .wrk files lying around; these contain the result of each pass.
Copyright © 1996-97 iMatix |