% % Copyright (c) 1996, Preston Briggs % All rights reserved. % % Redistribution and use in source and binary forms, with or without % modification, are permitted provided that the following conditions % are met: % % Redistributions of source code must retain the above copyright notice, % this list of conditions and the following disclaimer. % % Redistributions in binary form must reproduce the above copyright notice, % this list of conditions and the following disclaimer in the documentation % and/or other materials provided with the distribution. % % Neither name of the product nor the names of its contributors may % be used to endorse or promote products derived from this software without % specific prior written permission. % % THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS % ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT % LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS % FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS % OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, % EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, % PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR % PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY % OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING % NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS % SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. % % Notes: % Update on 2011-12-16 from Keith Harwood. % -- @@s in scrap supresses indent of following fragment expansion % -- @@ in text is expanded % Update on 2011-04-20 from Keith Harwood. % -- @@t provide fragment title in output % Changes from 2010-03-11 in the Sourceforge revision history. -- sjw % Updates on 2004-02-23 from Gregor Goldbach % % -- new command line option -r which will make nuweb typeset each % NWtarget and NWlink instance with \LaTeX-commands from the hyperref % package. This gives clickable scrap numbers which is very useful % when viewing the document on-line. % Updates on 2004-02-13 from Gregor Goldbach % % -- new command line option -l which will make nuweb typeset scraps % with the help of LaTeX's listings packages instead or pure % verbatim. % -- man page corrections and additions. % Updates on 2003-04-24 from Keith Harwood % -- sectioning commands (@@s and global-plus-sign in scrap names) % -- @@x..@@x labelling % -- @@f current file % -- @@\# suppress indent % This file has been changed by Javier Goizueta % on 2001-02-15. % These are the changes: % LANG -- Introduction of \NW macros to substitue language dependent text % DIAM -- Introduction of \NWsep instead of the \diamond separator % HYPER -- Introduction of hyper-references % NAME -- LaTeX formatting of macro names in HTML output % ADJ -- Adjust of the spacing between < and a fragment name % TEMPN -- Fix of the use of tempnam % LINE -- Synchronizing #lines when @@% is used % MAC -- definition of the macros used by LANG,DIAM,HYPER % CHAR -- Introduce @@r to change the nuweb meta character (@@) % TIE -- Replacement of ~ by "\nobreak\ " % SCRAPs-- Elimination of s % DNGL -- Correction: option -d was not working and was misdocumented % --after the CHAR modifications, to be able to specify non-ascii characters % for the scape character, the program must be compiled with the -K % option in Borland compilers or the -funsigned-char in GNU's gcc % to treat char as an unsigned value when converted to int. % To make the program independent of those options, either char % should be changed to unsigned char (bad solution, since unsigned % char should be used for numerical purposes) or attention should % be payed to all char-int conversions. (including comparisons) % --2002-01-15: the TILDE modificiation is necessary because some ties % have been introduced in version 0.93 in troublesome places when % the babel package is used with the spanish.ldf option (which makes % ~ an active character). % --2002-01-15: an ``s'' was being added to the NWtxtDefBy and % NWtxtDefBy messages when followed by more than one reference. \documentclass[a4paper]{report} \newif\ifshowcode \showcodetrue \usepackage{latexsym} %\usepackage{html} \usepackage{listings} \usepackage{color} \definecolor{linkcolor}{rgb}{0, 0, 0.7} \usepackage[% backref,% raiselinks,% pdfhighlight=/O,% pagebackref,% hyperfigures,% breaklinks,% colorlinks,% pdfpagemode=None,% pdfstartview=FitBH,% linkcolor={linkcolor},% anchorcolor={linkcolor},% citecolor={linkcolor},% filecolor={linkcolor},% menucolor={linkcolor},% pagecolor={linkcolor},% urlcolor={linkcolor}% ]{hyperref} \setlength{\oddsidemargin}{0in} \setlength{\evensidemargin}{0in} \setlength{\topmargin}{0in} \addtolength{\topmargin}{-\headheight} \addtolength{\topmargin}{-\headsep} \setlength{\textheight}{8.9in} \setlength{\textwidth}{6.5in} \setlength{\marginparwidth}{0.5in} \title{Nuweb Version 1.62 \\ A Simple Literate Programming Tool} \date{} \author{Preston Briggs\thanks{This work has been supported by ARPA, through ONR grant N00014-91-J-1989.} \\ {\sl preston@@tera.com} \\ HTML scrap generator by John D. Ramsdell \\ {\sl ramsdell@@mitre.org} \\ Scrap formatting by Marc W. Mengel \\ {\sl mengel@@fnal.gov} \\ Continuing maintenance by Simon Wright \\ {\sl simon@@pushface.org} \\ and Keith Harwood \\ {\sl Keith.Harwood@@vitalmis.com}} \begin{document} \pagenumbering{roman} \maketitle \tableofcontents \chapter{Introduction} \pagenumbering{arabic} In 1984, Knuth introduced the idea of {\em literate programming\/} and described a pair of tools to support the practise~\cite{Knuth:CJ-27-2-97}. His approach was to combine Pascal code with \TeX\ documentation to produce a new language, \verb|WEB|, that offered programmers a superior approach to programming. He wrote several programs in \verb|WEB|, including \verb|weave| and \verb|tangle|, the programs used to support literate programming. The idea was that a programmer wrote one document, the web file, that combined documentation (written in \TeX~\cite{Knuth:ct-a}) with code (written in Pascal). Running \verb|tangle| on the web file would produce a complete Pascal program, ready for compilation by an ordinary Pascal compiler. The primary function of \verb|tangle| is to allow the programmer to present elements of the program in any desired order, regardless of the restrictions imposed by the programming language. Thus, the programmer is free to present his program in a top-down fashion, bottom-up fashion, or whatever seems best in terms of promoting understanding and maintenance. Running \verb|weave| on the web file would produce a \TeX\ file, ready to be processed by \TeX\@@. The resulting document included a variety of automatically generated indices and cross-references that made it much easier to navigate the code. Additionally, all of the code sections were automatically prettyprinted, resulting in a quite impressive document. Knuth also wrote the programs for \TeX\ and {\small\sf METAFONT} entirely in \verb|WEB|, eventually publishing them in book form~\cite{Knuth:ct-b,Knuth:ct-d}. These are probably the largest programs ever published in a readable form. Inspired by Knuth's example, many people have experimented with \verb|WEB|\@@. Some people have even built web-like tools for their own favorite combinations of programming language and typesetting language. For example, \verb|CWEB|, Knuth's current system of choice, works with a combination of C (or C++) and \TeX~\cite{Levy:TB8-1-12}. Another system, FunnelWeb, is independent of any programming language and only mildly dependent on \TeX~\cite{Williams:FUM92}. Inspired by the versatility of FunnelWeb and by the daunting size of its documentation, I decided to write my own, very simple, tool for literate programming.% \footnote{There is another system similar to mine, written by Norman Ramsey, called {\em noweb}~\cite{Ramsey:LPT92}. It perhaps suffers from being overly Unix-dependent and requiring several programs to use. On the other hand, its command syntax is very nice. In any case, nuweb certainly owes its name and a number of features to his inspiration.} \section{Nuweb} Nuweb works with any programming language and \LaTeX~\cite{Lamport:LDP85}. I wanted to use \LaTeX\ because it supports a multi-level sectioning scheme and has facilities for drawing figures. I wanted to be able to work with arbitrary programming languages because my friends and I write programs in many languages (and sometimes combinations of several languages), {\em e.g.,} C, Fortran, C++, yacc, lex, Scheme, assembly, Postscript, and so forth. The need to support arbitrary programming languages has many consequences: \begin{description} \item[No prettyprinting] Both \verb|WEB| and \verb|CWEB| are able to prettyprint the code sections of their documents because they understand the language well enough to parse it. Since we want to use {\em any\/} language, we've got to abandon this feature. However, we do allow particular individual formulas or fragments of \LaTeX\ code to be formatted and still be parts of output files. Also, keywords in scraps can be surrounded by \verb|@@_| to have them be bold in the output. \item[No index of identifiers] Because \verb|WEB| knows about Pascal, it is able to construct an index of all the identifiers occurring in the code sections (filtering out keywords and the standard type identifiers). Unfortunately, this isn't as easy in our case. We don't know what an identifier looks like in each language and we certainly don't know all the keywords. (On the other hand, see the end of Section~\ref{minorcommands}) \end{description} Of course, we've got to have some compensation for our losses or the whole idea would be a waste. Here are the advantages I can see: \begin{description} \item[Simplicity] The majority of the commands in \verb|WEB| are concerned with control of the automatic prettyprinting. Since we don't prettyprint, many commands are eliminated. A further set of commands is subsumed by \LaTeX\ and may also be eliminated. As a result, our set of commands is reduced to only four members (explained in the next section). This simplicity is also reflected in the size of this tool, which is quite a bit smaller than the tools used with other approaches. \item[No prettyprinting] Everyone disagrees about how their code should look, so automatic formatting annoys many people. One approach is to provide ways to control the formatting. Our approach is simpler---we perform no automatic formatting and therefore allow the programmer complete control of code layout. We do allow individual scraps to be presented in either verbatim, math, or paragraph mode in the \TeX\ output. \item[Control] We also offer the programmer complete control of the layout of his output files (the files generated during tangling). Of course, this is essential for languages that are sensitive to layout; but it is also important in many practical situations, {\em e.g.,} debugging. \item[Speed] Since nuweb doesn't do too much, the nuweb tool runs quickly. I combine the functions of \verb|tangle| and \verb|weave| into a single program that performs both functions at once. \item[Page numbers] Inspired by the example of noweb, nuweb refers to all scraps by page number to simplify navigation. If there are multiple scraps on a page (say, page~17), they are distinguished by lower-case letters ({\em e.g.,} 17a, 17b, and so forth). \item[Multiple file output] The programmer may specify more than one output file in a single nuweb file. This is required when constructing programs in a combination of languages (say, Fortran and C)\@@. It's also an advantage when constructing very large programs that would require a lot of compile time. \end{description} This last point is very important. By allowing the creation of multiple output files, we avoid the need for monolithic programs. Thus we support the creation of very large programs by groups of people. A further reduction in compilation time is achieved by first writing each output file to a temporary location, then comparing the temporary file with the old version of the file. If there is no difference, the temporary file can be deleted. If the files differ, the old version is deleted and the temporary file renamed. This approach works well in combination with \verb|make| (or similar tools), since \verb|make| will avoid recompiling untouched output files. \subsection{Nuweb and HTML} In addition to producing {\LaTeX} source, nuweb can be used to generate HyperText Markup Language (HTML), the markup language used by the World Wide Web. HTML provides hypertext links. When an HTML document is viewed online, a user can navigate within the document by activating the links. The tools which generate HTML automatically produce hypertext links from a nuweb source. (Note that hyperlinks can be included in \LaTeX\ using the \verb|hyperref| package. This is now the preferred way of doing this and the HTML processing is not up to date.) \section{Writing Nuweb} The bulk of a nuweb file will be ordinary \LaTeX\@@. In fact, any {\LaTeX} file can serve as input to nuweb and will be simply copied through, unchanged, to the documentation file---unless a nuweb command is discovered. All nuweb commands begin with an ``at-sign'' (\verb|@@|). Therefore, a file without at-signs will be copied unchanged. Nuweb commands are used to specify {\em output files,} define {\em fragments,} and delimit {\em scraps}. These are the basic features of interest to the nuweb tool---all else is simply text to be copied to the documentation file. \subsection{The Major Commands} Files and fragments are defined with the following commands: \begin{description} \item[{\tt @@o} {\em file-name flags scrap\/}] Output a file. The file name is terminated by whitespace. \item[{\tt @@d} {\em fragment-name scrap\/}] Define a fragment. The fragment name is terminated by a return or the beginning of a scrap. \item[{\tt @@q} {\em fragment-name scrap\/}] Define a fragment. The fragment name is terminated by a return or the beginning of a scrap. This a quoted fragment. \end{description} A specific file may be specified several times, with each definition being written out, one after the other, in the order they appear. The definitions of fragments may be similarly specified piecemeal. A fragment name may have embedded parameters. The parameters are denoted by the sequence \texttt{@@'value@@'} where \texttt{value} is an uninterpreted string of characters (although the sequence \texttt{@@@@} denotes a single \texttt{@@} character). When a fragment name is used inside a scrap the parameters may be replaced by an argument which may be a different literal string, a fragment use, an embedded fragment or by a parameter use. The difference between a quoted fragment (\texttt{@@q}) and an ordinary one (\texttt{@@d}) is that inside a quoted fragment fragments are not expanded on output. Rather, they are formatted as uses of fragments so that the output file can itself be \texttt{nuweb} source. This allows you to create files containing fragments which can undergo further processing before the fragments are expanded, while also describing and documenting them in one place. You can have both quoted and unquoted fragments with the same name. They are written out in order as usual, with those introduced by \texttt{@@q} being quoted and those with \texttt{@@d} expanded as normal. In quoted fragments, the \texttt{@@f} filename is quoted as well, so that when it is expanded it refers to the finally produced file, not any of the intermediate ones. \subsubsection{Scraps} Scraps have specific begin markers and end markers to allow precise control over the contents and layout. Note that any amount of whitespace (including carriage returns) may appear between a name and the beginning of a scrap. Scraps may also appear in the running text where they are formatted (almost) identically to their use in definitions, but don't appear in any code output. \begin{description} \item[\tt @@\{{\em anything\/}@@\}] where the scrap body includes every character in {\em anything\/}---all the blanks, all the tabs, all the carriage returns. This scrap will be typeset in verbatim mode. Using the \texttt{-l} option will cause the program to typeset the scrap with the help of \LaTeX's \texttt{listings} package. \item[\tt @@[{\em anything\/}@@]] where the scrap body includes every character in {\em anything\/}---all the blanks, all the tabs, all the carriage returns. This scrap will be typeset in paragraph mode, allowing sections of \TeX\ documents to be scraps, but still be prettyprinted in the document. \item[\tt @@({\em anything\/}@@)] where the scrap body includes every character in {\em anything\/}---all the blanks, all the tabs, all the carriage returns. This scrap will be typeset in math mode. This allows this scrap to contain a formula which will be typeset nicely. \end{description} Inside a scrap, we may invoke a fragment. \begin{description} \item[\tt @@<{\em fragment-name\/}@@>] This is a fragment use. It causes the fragment {\em fragment-name\/} to be expanded inline as the code is written out to a file. It is an error to specify recursive fragment invocations. \item[\tt @@<{\em fragment-name\/}@@( {\em a1} @@, {\em a2} @@) @@>] This is the old form of parameterising a fragment. It causes the fragment {\em fragment-name\/} to be expanded inline with the arguments {\em a1}, {\em a2}, etc. Up to 9 arguments may be given. \item[\tt @@1, @@2, ..., @@9] In a fragment causes the n'th fragment argument to be substituted into the scrap. If the argument is not passed, a null string is substituted. Arguments can be passed in two ways, either embedded in the fragment name or as the old form given above. An embedded argument may specified in four ways. \begin{description} \item[\tt @@'string@@'] The string will be copied literally into the called fragment. It will not be interpreted (except for \texttt{@@@@} converted to \texttt{@@}). \item[\tt @@<{\em fragment-name\/}@@>] The fragment will be expanded in the usual fashion and the results passed to the called fragment. \item[\tt @@\{text@@\}] The text will be expanded as normal and the results passed to the called fragment. This behaves like an anonymous fragment which has the same arguments as the calling fragment. Its principle use is to combine text and arguments into one argument. \item[\tt @@1, @@2, ..., @@9] The argument of the calling fragment will be passed to the called fragment and expanded in there. \end{description} If an argument is used but there is no corresponding parameter in the fragment name, the null string is substituted. But what happens if there is a parameter in the full name of a fragment, but a particular application of the fragment is abbreviated (using the \verb|. . .| notation) and the argument is missed? In that case the argument is replaced by the string given in the definition of the fragment. In the old form the parameter may contain any text and will be expanded as a normal scrap. The two forms of parameter passing don't play nicely together. If a scrap passes both embedded and old form arguments the old form arguments are ignored. \item[\tt @@x{\em label}@@x] Marks a place inside a scrap and associates it to the label (which can be any text not containing a @@). Expands to the reference number of the scrap followed by a numeric value. Outside scraps it expands to the same value. It is used so that text outside scraps can refer to particular places within scraps. \item[\tt @@f] Inside a scrap this is replaced by the name of the current output file. \item[\tt @@t] Inside a scrap this is replaced by the title of the fragment as it is at the point it is used, with all parameters replaced by actual arguments. \item[\tt @@\#] At the beginning of a line in a scrap this will suppress the normal indentation for that line. Use this, for example, when you have a \verb|#ifdef| inside a nested scrap. Writing \verb|@@##ifdef| will cause it to be lined up on the left rather than indented with the rest of its code. \item[\tt @@s] Inside a scrap supresses indentation for the following fragment expansion. \end{description} Note that fragment names may be abbreviated, either during invocation or definition. For example, it would be very tedious to have to type, repeatedly, the fragment name \begin{quote} \verb|@@d Check for terminating at-sequence and return name if found| \end{quote} Therefore, we provide a mechanism (stolen from Knuth) of indicating abbreviated names. \begin{quote} \verb|@@d Check for terminating...| \end{quote} Basically, the programmer need only type enough characters to identify the fragment name uniquely, followed by three periods. An abbreviation may even occur before the full version; nuweb simply preserves the longest version of a fragment name. Note also that blanks and tabs are insignificant within a fragment name; each string of them is replaced by a single blank. Sometimes, for instance during program testing, it is convenient to comment out a few lines of code. In C or Fortran placing \verb|/* ... */| around the relevant code is not a robust solution, as the code itself may contain comments. Nuweb provides the command \begin{quote} \verb|@@%| \end{quote}only to be used inside scraps. It behaves exactly the same as \verb|%| in the normal {\LaTeX} text body. When scraps are written to a program file or a documentation file, tabs are expanded into spaces by default. Currently, I assume tab stops are set every eight characters. Furthermore, when a fragment is expanded in a scrap, the body of the fragment is indented to match the indentation of the fragment invocation. Therefore, care must be taken with languages ({\em e.g.,} Fortran) that are sensitive to indentation. These default behaviors may be changed for each output file (see below). \subsubsection{Flags} When defining an output file, the programmer has the option of using flags to control output of a particular file. The flags are intended to make life a little easier for programmers using certain languages. They introduce little language dependences; however, they do so only for a particular file. Thus it is still easy to mix languages within a single document. There are four ``per-file'' flags: \begin{description} \item[\tt -d] Forces the creation of \verb|#line| directives in the output file. These are useful with C (and sometimes C++ and Fortran) on many Unix systems since they cause the compiler's error messages to refer to the web file rather than to the output file. Similarly, they allow source debugging in terms of the web file. \item[\tt -i] Suppresses the indentation of fragments. That is, when a fragment is expanded within a scrap, it will {\em not\/} be indented to match the indentation of the fragment invocation. This flag would seem most useful for Fortran programmers. \item[\tt -t] Suppresses expansion of tabs in the output file. This feature seems important when generating \verb|make| files. \item[\tt -c{\it x}] Puts comments in generated code documenting the fragment that generated that code. The {\it x} selects the comment style for the language in use. So far the only valid values are \verb|c| to get \verb|C| comment style, \verb|+| for \verb|C++| and \verb|p| for \verb|Perl|. (\verb|Perl| commenting can be used for several languages including \verb|sh| and, mostly, \verb|tcl|.) If the global \verb|-x| cross-reference flag is set the comment includes the page reference for the first scrap that generated the code. \end{description} \subsection{The Minor Commands\label{minorcommands}} We have some very low-level utility commands that may appear anywhere in the web file. \begin{description} \item[\tt @@@@] Causes a single ``at sign'' to be copied into the output. \item[\tt @@\_] Causes the text between it and the next {\tt @@\_} to be made bold (for keywords, etc.) \item[\tt @@i {\em file-name\/}] Includes a file. Includes may be nested, though there is currently a limit of 10~levels. The file name should be complete (no extension will be appended) and should be terminated by a carriage return. Normally the current directory is searched for the file to be included, but this can be varied using the \verb|-I| flag on the command line. Each such flag adds one directory tothe search path and they are searched in the order given. \item[{\tt @@r}$x$] Changes the escape character `@@' to `$x$'. This must appear before any scrap definitions. \item[\tt @@v] Always replaced by the string established by the \texttt{-V} flag, or a default string if the flag isn't given. This is intended to mark versions of the generated files. \end{description} In the text of the document, that is outside scraps, you may include scrap-like material. \begin{description} \item[\tt @@\{\textit{Anything}@@\}] The included material, the \textit{Anything}, is typeset as if it appeared inside a scrap. This is useful for referring to fragments in the text and for describing the literate programming process itself. \item[\tt @@<\textit{Fragment name}@@>] The fragment named is expanded in place in the text. The expansion is presented verbatim, it is not interpretted for typesetting, so any special environment must be set up before and after this is used. \end{description} Finally, there are three commands used to create indices to the fragment names, file definitions, and user-specified identifiers. \begin{description} \item[\tt @@f] Create an index of file names. \item[\tt @@m] Create an index of fragment names. \item[\tt @@u] Create an index of user-specified identifiers. \end{description} I usually put these in their own section in the \LaTeX\ document; for example, see Chapter~\ref{indices}. Identifiers must be explicitly specified for inclusion in the \verb|@@u| index. By convention, each identifier is marked at the point of its definition; all references to each identifier (inside scraps) will be discovered automatically. To ``mark'' an identifier for inclusion in the index, we must mention it at the end of a scrap. For example, \begin{quote} \begin{verbatim} @@d a scrap @@{ Let's pretend we're declaring the variables FOO and BAR inside this scrap. @@| FOO BAR @@} \end{verbatim} \end{quote} I've used alphabetic identifiers in this example, but any string of characters (not including whitespace or \verb|@@| characters) will do. Therefore, it's possible to add index entries for things like \verb|<<=| if desired. An identifier may be declared in more than one scrap. In the generated index, each identifier appears with a list of all the scraps using and defining it, where the defining scraps are distinguished by underlining. Note that the identifier doesn't actually have to appear in the defining scrap; it just has to be in the list of definitions at the end of a scrap. \section{Sectioning commands} For larger documents the indexes and usage lists get rather unwieldy and problems arise in naming things so that different things in different parts of the document don't get confused. We have a sectioning command which keeps the fragment names and user identifiers separate. Thus, you can give a fragment in one section the same name as a fragment in another and the two won't be confused or connected in any way. Nor will user identifiers defined in one section be referenced in another. Except for the fact that scraps in successive sections can go into the same output file, this is the same as if the sections came from separate input files. However, occasionally you may really want fragments from one section to be used in another. More often, you will want to identify a user identifier in one section with the same identifier in another (as, for example, a header file defined in one section is included in code in another). Extra commands allow a fragment defined in one section to be accessible from all other sections. Similarly, you can have scraps which define user identifiers and export them so that they can be used in other sections. \begin{description} \item[{\tt @@s}] Start a new section. \item[{\tt @@S}] Close the current section and don't start another. \item[{\tt @@d+} fragment-name scrap] Define a fragment which is accessible in all sections, a global fragment. \item[{\tt @@D+}] Likewise \item[{\tt @@q+}] Likewise \item[{\tt @@Q+}] Likewise \item[{\tt @@m+}] Create an index of all such fragments. \item[{\tt @@u+}] Create an index of globally accessible user identifiers. \end{description} There are two kinds of section, the base section which is where normally everything goes, and local sections which are introduced with the {\tt @@s} command. A local section comprises everything from the command which starts it to the one which ends it. A {\tt @@s} command will start a new local section. A {\tt @@S} command closes the current local section, but doesn't open another, so what follows goes into the base section. Note that fragments defined in the base section aren't global; they are accessible only in the base section, but they are accessible regardless of any local sections between their definition and their use. Within a scrap: \begin{description} \item[\tt @@<+{\em fragment-name\/}@@>] Expand the globally accessible fragment with that name, rather than any local fragment. \item[{\tt @@+}] Like \verb"@@|" except that the identifiers defined are exported to the global realm and are not directly referenced in any scrap in any section (not even the one where they are defined). \item[{\tt @@-}] Like \verb"@@|" except that the identifiers are imported to the local realm. The cross-references show where the global variables are defined and defines the same names as locally accesible. Uses of the names within the section will point to this scrap. \end{description} Note that the \verb"+" signs above are part of the commands. They are not part of the fragment names. If you want a fragment whose name begins with a plus sign, leave a space between the command and the name. \section{Running Nuweb} Nuweb is invoked using the following command: \begin{quote} {\tt nuweb} {\em flags file-name}\ldots \end{quote} One or more files may be processed at a time. If a file name has no extension, \verb|.w| will be appended. {\LaTeX} suitable for translation into HTML by {\LaTeX}2HTML will be produced from files whose name ends with \verb|.hw|, otherwise, ordinary {\LaTeX} will be produced. While a file name may specify a file in another directory, the resulting documentation file will always be created in the current directory. For example, \begin{quote} {\tt nuweb /foo/bar/quux} \end{quote} will take as input the file \verb|/foo/bar/quux.w| and will create the file \verb|quux.tex| in the current directory. By default, nuweb performs both tangling and weaving at the same time. Normally, this is not a bottleneck in the compilation process; however, it's possible to achieve slightly faster throughput by avoiding one or another of the default functions using command-line flags. There are currently three possible flags: \begin{description} \item[\tt -t] Suppress generation of the documentation file. \item[\tt -o] Suppress generation of the output files. \item[\tt -c] Avoid testing output files for change before updating them. \end{description} Thus, the command \begin{quote} \verb|nuweb -to /foo/bar/quux| \end{quote} would simply scan the input and produce no output at all. There are several additional command-line flags: \begin{description} \item[\tt -v] For ``verbose'', causes nuweb to write information about its progress to \verb|stderr|. \item[\tt -n] Forces scraps to be numbered sequentially from~1 (instead of using page numbers). This form is perhaps more desirable for small webs. \item[\tt -s] Doesn't print list of scraps making up each file following each scrap. \item[\tt -d] Print ``dangling'' identifiers -- user identifiers which are never referenced, in indices, etc. \item[\tt -p {\it path}] Prepend \textit{path} to the filenames for all the output files. \item[\tt -l] \label{sec:pretty-print} Use the \texttt{listings} package for formatting scraps. Use this if you want to have a pretty-printer for your scraps. In order to e.g. have pretty Perl scraps, include the following \LaTeX\ commands in your document: \lstset{language=[LaTeX]TeX} \begin{lstlisting}{language={[LaTeX]TeX}} \usepackage{listings} ... \lstset{extendedchars=true,keepspaces=true,language=perl} \end{lstlisting} See the \texttt{listings} documentation for a list of formatting options. Be sure to include a \texttt{\char92 usepackage\{listings\}} in your document. \item[\tt -V \it string] Provide \textit{string} as the replacement for the @@v operation. \end{description} \section{Generating HTML} Nikos Drakos' {\LaTeX}2HTML Version 0.5.3~\cite{drakos:94} can be used to translate {\LaTeX} with embedded HTML scraps into HTML\@@. Be sure to include the document-style option \verb|html| so that {\LaTeX} will understand the hypertext commands. When translating into HTML, do not allow a document to be split by specifying ``\verb|-split 0|''. You need not generate navigation links, so also specify ``\verb|-no_navigation|''. While preparing a web, you may want to view the program's scraps without taking the time to run {\LaTeX}2HTML\@@. Simply rename the generated {\LaTeX} source so that its file name ends with \verb|.html|, and view that file. The documentations section will be jumbled, but the scraps will be clear. (Note that the HTML generation is not currently maintained. If the only reason you want HTML is ti get hyperlinks, use the {\LaTeX} \verb|hyperref| package and produce your document as PDF via \verb|pdflatex|.) \section{Restrictions} Because nuweb is intended to be a simple tool, I've established a few restrictions. Over time, some of these may be eliminated; others seem fundamental. \begin{itemize} \item The handling of errors is not completely ideal. In some cases, I simply warn of a problem and continue; in other cases I halt immediately. This behavior should be regularized. \item I warn about references to fragments that haven't been defined, but don't halt. The name of the fragment is included in the output file surrounded by \verb|<>| signs. This seems most convenient for development, but may change in the future. \item File names and index entries should not contain any \verb|@@| signs. \item Fragment names may be (almost) any well-formed \TeX\ string. It makes sense to change fonts or use math mode; however, care should be taken to ensure matching braces, brackets, and dollar signs. When producing HTML, fragments are displayed in a preformatted element (PRE), so fragments may contain one or more A, B, I, U, or P elements or data characters. \item Anything is allowed in the body of a scrap; however, very long scraps (horizontally or vertically) may not typeset well. \item Temporary files (created for comparison to the eventual output files) are placed in the current directory. Since they may be renamed to an output file name, all the output files should be on the same file system as the current directory. Alternatively, you can use the \verb|-p| flag to specify where the files go. \item Because page numbers cannot be determined until the document has been typeset, we have to rerun nuweb after \LaTeX\ to obtain a clean version of the document (very similar to the way we sometimes have to rerun \LaTeX\ to obtain an up-to-date table of contents after significant edits). Nuweb will warn (in most cases) when this needs to be done; in the remaining cases, \LaTeX\ will warn that labels may have changed. \end{itemize} Very long scraps may be allowed to break across a page if declared with \verb|@@O| or \verb|@@D| (instead of \verb|@@o| and \verb|@@d|). This doesn't work very well as a default, since far too many short scraps will be broken across pages; however, as a user-controlled option, it seems very useful. No distinction is made between the upper case and lower case forms of these commands when generating HTML\@@. \section{Acknowledgements} Several people have contributed their times, ideas, and debugging skills. In particular, I'd like to acknowledge the contributions of Osman Buyukisik, Manuel Carriba, Adrian Clarke, Tim Harvey, Michael Lewis, Walter Ravenek, Rob Shillingsburg, Kayvan Sylvan, Dominique de~Waleffe, and Scott Warren. Of course, most of these people would never have heard or nuweb (or many other tools) without the efforts of George Greenwade. Since maintenance has been taken over by Marc Mengel, Simon Wright and Keith Harwood online contributions have been made by: \begin{itemize} \item Walter Brown \verb|| \item Nicky van Foreest \verb|| \item Javier Goizueta \verb|| \item Alan Karp \verb|| \end{itemize} \ifshowcode \chapter{The Overall Structure} Processing a web requires three major steps: \begin{enumerate} \item Read the source, accumulating file names, fragment names, scraps, and lists of cross-references. \item Reread the source, copying out to the documentation file, with protection and cross-reference information for all the scraps. \item Traverse the list of files names. For each file name: \begin{enumerate} \item Dump all the defining scraps into a temporary file. \item If the file already exists and is unchanged, delete the temporary file; otherwise, rename the temporary file. \end{enumerate} \end{enumerate} \section{Files} I have divided the program into several files for quicker recompilation during development. @o global.h -cc -d @{@ @ @ @ @ @@} We'll need at least five of the standard system include files. @d Include files @{ #include #include #include #include #include #include @| FILE stderr exit fprintf fputs fopen fclose getc putc strlen toupper isupper islower isgraph isspace remove malloc size_t setlocale getenv @} \newpage \noindent I like to use \verb|TRUE| and \verb|FALSE| in my code. I'd use an \verb|enum| here, except that some systems seem to provide definitions of \verb|TRUE| and \verb|FALSE| by default. The following code seems to work on all the local systems. @d Type dec... @{#ifndef FALSE #define FALSE 0 #endif #ifndef TRUE #define TRUE 1 #endif @| FALSE TRUE @} Here we define the maximum length for names (of fragments etc). @d Limits @{@% #ifndef MAX_NAME_LEN #define MAX_NAME_LEN 1024 #endif @| MAX_NAME_LEN @} \subsection{The Main Files} The code is divided into four main files (introduced here) and five support files (introduced in the next section). The file \verb|main.c| will contain the driver for the whole program (see Section~\ref{main-routine}). @o main.c -cc -d @{#include "global.h" @} The first pass over the source file is contained in \verb|pass1.c|. It handles collection of all the file names, fragments names, and scraps (see Section~\ref{pass-one}). @o pass1.c -cc -d @{#include "global.h" @} The \verb|.tex| file is created during a second pass over the source file. The file \verb|latex.c| contains the code controlling the construction of the \verb|.tex| file (see Section~\ref{latex-file}). @o latex.c -cc -d @{#include "global.h" static int scraps = 1; @} The file \verb|html.c| contains the code controlling the construction of the \verb|.tex| file appropriate for use with {\LaTeX}2HTML (see Section~\ref{html-file}). @o html.c @{#include "global.h" static int scraps = 1; @} The code controlling the creation of the output files is in \verb|output.c| (see Section~\ref{output-files}). @o output.c -cc -d @{#include "global.h" @} \newpage \subsection{Support Files} The support files contain a variety of support routines used to define and manipulate the major data abstractions. The file \verb|input.c| holds all the routines used for referring to source files (see Section~\ref{source-files}). @o input.c -cc -d @{#include "global.h" @} Creation and lookup of scraps is handled by routines in \verb|scraps.c| (see Section~\ref{scraps}). @o scraps.c -cc -d @{#include "global.h" @} The handling of file names and fragment names is detailed in \verb|names.c| (see Section~\ref{names}). @o names.c -cc -d @{#include "global.h" @} Memory allocation and deallocation is handled by routines in \verb|arena.c| (see Section~\ref{memory-management}). @o arena.c -cc -d @{#include "global.h" @} Finally, for best portability, I seem to need a file containing (useless!) definitions of all the global variables. @o global.c -cc -d @{#include "global.h" @ @ @} \section{The Main Routine} \label{main-routine} The main routine is quite simple in structure. It wades through the optional command-line arguments, then handles any files listed on the command line. @o main.c -cc -d @{ #include int main(int argc, char** argv) { int arg = 1; @ @ initialise_delimit_scrap_array(); @ exit(0); } @| main @} We only have two major operating system dependencies; the separators for file names, and how to set environment variables. For now we assume the latter can be accomplished via \verb|putenv()| in \verb|stdlib.h|. @d Operating System Dependencies @{ #if defined(VMS) #define PATH_SEP(c) (c==']'||c==':') #define PATH_SEP_CHAR "" #define DEFAULT_PATH "" #elif defined(MSDOS) #define PATH_SEP(c) (c=='\\') #define PATH_SEP_CHAR "\\" #define DEFAULT_PATH "." #else #define PATH_SEP(c) (c=='/') #define PATH_SEP_CHAR "/" #define DEFAULT_PATH "." #endif @} \subsection{Command-Line Arguments} There are numerous possible command-line arguments: \begin{description} \item[\tt -t] Suppresses generation of the {\tt .tex} file. \item[\tt -o] Suppresses generation of the output files. \item[\tt -d] list dangling identifier references in indexes. \item[\tt -c] Forces output files to overwrite old files of the same name without comparing for equality first. \item[\tt -v] The verbose flag. Forces output of progress reports. \item[\tt -n] Forces sequential numbering of scraps (instead of page numbers). \item[\tt -s] Doesn't print list of scraps making up file at end of each scrap. \item[\tt -x] Include cross-reference numbers in scrap comments. \item[\tt -p \it path] Prepend \textit{path} to the filenames for all the output files. \item[\tt -h \it options] Provide options for the hyperref package. \item[\tt -r] Turn on hyperlinks. You must include the |usepackage| options in the text. \end{description} There are two ways to get hyper-links into your documentation. With one, the \texttt{-r} flag simply arranges for fragment cross-references to be hyperlinks. You have to provide all the rest of the \texttt{hyperref} material yourself. The \texttt{-h} flag does more work, but requires you provide the \texttt{hyperref} options on the command line. You would use the first if you know you will use the same tools all the time. You would use the second if you have to distribute the nuweb file and use the build system to provide the \texttt{hyperref} options. The \texttt{-h}\ option is used as follows. If you want hyperlinks in your document give a string to this option which is a valid option for the \texttt{hyperref} package. For example, \verb|"dvips,colorlinks=true"| sets it to use the \texttt{dvips} driver and mark the links in colour. For more information about these options, see the \texttt{hyperref}\ documentation. Then in your document you put the command \verb|\NWuseHyperlinks| immadiately after your \verb|usepackage| commands. If you do both of these you will get hyperlinks. If you miss out either then nothing interesting happens. \noindent Global flags are declared for each of the arguments. @d Global variable dec... @{extern int tex_flag; /* if FALSE, don't emit the documentation file */ extern int html_flag; /* if TRUE, emit HTML instead of LaTeX scraps. */ extern int output_flag; /* if FALSE, don't emit the output files */ extern int compare_flag; /* if FALSE, overwrite without comparison */ extern int verbose_flag; /* if TRUE, write progress information */ extern int number_flag; /* if TRUE, use a sequential numbering scheme */ extern int scrap_flag; /* if FALSE, don't print list of scraps */ extern int dangling_flag; /* if FALSE, don't print dangling identifiers */ extern int xref_flag; /* If TRUE, print cross-references in scrap comments */ extern int prepend_flag; /* If TRUE, prepend a path to the output file names */ extern char * dirpath; /* The prepended directory path */ extern char * path_sep; /* How to join path to filename */ extern int listings_flag; /* if TRUE, use listings package for scrap formatting */ extern int version_info_flag; /* If TRUE, set up version string */ extern char * version_string; /* What to print for @@v */ extern int hyperref_flag; /* Are we preparing for hyperref package. */ extern int hyperopt_flag; /* Are we preparing for hyperref options */ extern char * hyperoptions; /* The options to pass to the hyperref package */ extern int includepath_flag; /* Do we have an include path? */ extern struct incl{char * name; struct incl * next;} * include_list; /* The list of include paths */ @| tex_flag html_flag output_flag compare_flag verbose_flag number_flag scrap_flag dangling_flag xref_flag version_info_flag version_string hyperref_flag hyperopt_flag hyperoptions includepath_flag incl @} The flags are all initialized for correct default behavior. @d Global variable def... @{int tex_flag = TRUE; int html_flag = FALSE; int output_flag = TRUE; int compare_flag = TRUE; int verbose_flag = FALSE; int number_flag = FALSE; int scrap_flag = TRUE; int dangling_flag = FALSE; int xref_flag = FALSE; int prepend_flag = FALSE; char * dirpath = DEFAULT_PATH; /* Default directory path */ char * path_sep = PATH_SEP_CHAR; int listings_flag = FALSE; int version_info_flag = FALSE; char default_version_string[] = "no version"; char * version_string = default_version_string; int hyperref_flag = FALSE; int hyperopt_flag = FALSE; char * hyperoptions = ""; int includepath_flag = FALSE; /* Do we have an include path? */ struct incl * include_list = NULL; /* The list of include paths */ @} A global variable \verb|nw_char| will be used for the nuweb meta-character, which by default will be @@. @d Global variable dec... @{extern int nw_char; @| nw_char @} @d Global variable def... @{int nw_char='@@'; @| nw_char @} We save the invocation name of the command in a global variable \verb|command_name| for use in error messages. @d Global variable dec... @{extern char *command_name; @| command_name @} @d Global variable def... @{char *command_name = NULL; @} The invocation name is conventionally passed in \verb|argv[0]|. @d Interpret com... @{command_name = argv[0]; @} We need to examine the remaining entries in \verb|argv|, looking for command-line arguments. @d Interpret com... @{while (arg < argc) { char *s = argv[arg]; if (*s++ == '-') { @ arg++; @ @ @ @ } else break; }@} Several flags can be stacked behind a single minus sign; therefore, we've got to loop through the string, handling them all. If this flag requires an argument we skip to getting its argument straight away. This allows arguments to butt up to their flags and also avoids ambiguity about which value goes with which flag. @d Interpret the... @{{ char c = *s++; while (c) { switch (c) { case 'c': compare_flag = FALSE; break; case 'd': dangling_flag = TRUE; break; case 'h': hyperopt_flag = TRUE; goto HasValue; case 'I': includepath_flag = TRUE; goto HasValue; case 'l': listings_flag = TRUE; break; case 'n': number_flag = TRUE; break; case 'o': output_flag = FALSE; break; case 'p': prepend_flag = TRUE; goto HasValue; case 'r': hyperref_flag = TRUE; break; case 's': scrap_flag = FALSE; break; case 't': tex_flag = FALSE; break; case 'v': verbose_flag = TRUE; break; case 'V': version_info_flag = TRUE; goto HasValue; case 'x': xref_flag = TRUE; break; default: fprintf(stderr, "%s: unexpected argument ignored. ", command_name); fprintf(stderr, "Usage is: %s [-cdnostvx] " "[-I path] [-V version] " "[-h options] [-p path] file...\n", command_name); break; } c = *s++; } @#HasValue:; }@} @d Perhaps get the prepend path @{if (prepend_flag) { if (*s == '\0') s = argv[arg++]; dirpath = s; prepend_flag = FALSE; } @} @d Perhaps add an include path @{if (includepath_flag) { struct incl * le = (struct incl *)arena_getmem(sizeof(struct incl)); struct incl ** p = &include_list; if (*s == '\0') s = argv[arg++]; le->name = save_string(s); le->next = NULL; while (*p != NULL) p = &((*p)->next); *p = le; includepath_flag = FALSE; } @} @d Perhaps get the version info string @{if (version_info_flag) { if (*s == '\0') s = argv[arg++]; version_string = s; version_info_flag = FALSE; } @} @d Perhaps get the hyperref options @{if (hyperopt_flag) { if (*s == '\0') s = argv[arg++]; hyperoptions = s; hyperopt_flag = FALSE; hyperref_flag = TRUE; } @} In order to be able to process files in foreign languages we set the locale information. \texttt{isgraph()} and friends need this (see Section~\ref{names}). Try to read \texttt{LC\_CTYPE} from the environment. Use \texttt{LC\_ALL} if that fails. Don't set the locale if reading \texttt{LC\_ALL} fails, too. Print a warning if setting the program's locale fails. @d Set locale information @{ { /* try to get locale information */ char *s=getenv("LC_CTYPE"); if (s==NULL) s=getenv("LC_ALL"); /* set it */ if (s!=NULL) if(setlocale(LC_CTYPE, s)==NULL) fprintf(stderr, "Setting locale failed\n"); } @} \subsection{File Names} We expect at least one file name. While a missing file name might be ignored without causing any problems, we take the opportunity to report the usage convention. @d Process the remaining... @{{ if (arg >= argc) { fprintf(stderr, "%s: expected a file name. ", command_name); fprintf(stderr, "Usage is: %s [-cnotv] [-p path] file-name...\n", command_name); exit(-1); } do { @ arg++; } while (arg < argc); }@} \newpage \noindent The code to handle a particular file name is rather more tedious than the actual processing of the file. A file name may be an arbitrarily complicated path name, with an optional extension. If no extension is present, we add \verb|.w| as a default. The extended path name will be kept in a local variable \verb|source_name|. The resulting documentation file will be written in the current directory; its name will be kept in the variable \verb|tex_name|. @d Handle the file... @{{ char source_name[FILENAME_MAX]; char tex_name[FILENAME_MAX]; char aux_name[FILENAME_MAX]; @ @ }@} I bump the pointer \verb|p| through all the characters in \verb|argv[arg]|, copying all the characters into \verb|source_name| (via the pointer \verb|q|). At each slash, I update \verb|trim| to point just past the slash in \verb|source_name|. The effect is that \verb|trim| will point at the file name without any leading directory specifications. The pointer \verb|dot| is made to point at the file name extension, if present. If there is no extension, we add \verb|.w| to the source name. In any case, we create the \verb|tex_name| from \verb|trim|, taking care to get the correct extension. The \verb|html_flag| is set in this scrap. @d Build \verb|sou... @{{ char *p = argv[arg]; char *q = source_name; char *trim = q; char *dot = NULL; char c = *p++; while (c) { *q++ = c; if (PATH_SEP(c)) { trim = q; dot = NULL; } else if (c == '.') dot = q - 1; c = *p++; } @ *q = '\0'; if (dot) { *dot = '\0'; /* produce HTML when the file extension is ".hw" */ html_flag = dot[1] == 'h' && dot[2] == 'w' && dot[3] == '\0'; sprintf(tex_name, "%s%s%s.tex", dirpath, path_sep, trim); sprintf(aux_name, "%s%s%s.aux", dirpath, path_sep, trim); *dot = '.'; } else { sprintf(tex_name, "%s%s%s.tex", dirpath, path_sep, trim); sprintf(aux_name, "%s%s%s.aux", dirpath, path_sep, trim); *q++ = '.'; *q++ = 'w'; *q = '\0'; } }@} If the source file has a directory part we add that directory to the end of the search path so that the path of last resort is the same as the source path, not, as one person put it, something irrelevant like the directory nuweb was invoked from. @d Add the source path to the include path list @{if (trim != source_name) { struct incl * le = (struct incl *)arena_getmem(sizeof(struct incl)); struct incl ** p = &include_list; char sv = *trim; *trim = '\0'; le->name = save_string(source_name); le->next = NULL; while (*p != NULL) p = &((*p)->next); *p = le; *trim = sv; } @} Now that we're finally ready to process a file, it's not really too complex. We bundle most of the work into four routines \verb|pass1| (see Section~\ref{pass-one}), \verb|write_tex| (see Section~\ref{latex-file}), \verb|write_html| (see Section~\ref{html-file}), and \verb|write_files| (see Section~\ref{output-files}). After we're finished with a particular file, we must remember to release its storage (see Section~\ref{memory-management}). The sequential numbering of scraps is forced when generating HTML. @d Process a file @{{ pass1(source_name); current_sector = 1; prev_sector = 1; if (tex_flag) { if (html_flag) { int saved_number_flag = number_flag; number_flag = TRUE; collect_numbers(aux_name); write_html(source_name, tex_name); number_flag = saved_number_flag; } else { collect_numbers(aux_name); write_tex(source_name, tex_name); } } if (output_flag) write_files(file_names); arena_free(); }@} \newpage \section{Pass One} \label{pass-one} During the first pass, we scan the file, recording the definitions of each fragment and file and accumulating all the scraps. @d Function pro... @{extern void pass1(char *file_name); @} The routine \verb|pass1| takes a single argument, the name of the source file. It opens the file, then initializes the scrap structures (see Section~\ref{scraps}) and the roots of the file-name tree, the fragment-name tree, and the tree of user-specified index entries (see Section~\ref{names}). After completing all the necessary preparation, we make a pass over the file, filling in all our data structures. Next, we seach all the scraps for references to the user-specified index entries. Finally, we must reverse all the cross-reference lists accumulated while scanning the scraps. @o pass1.c -cc -d @{void pass1(char *file_name) { if (verbose_flag) fprintf(stderr, "reading %s\n", file_name); source_open(file_name); init_scraps(); macro_names = NULL; file_names = NULL; user_names = NULL; @ if (tex_flag) search(); @ } @| pass1 @} The only thing we look for in the first pass are the command sequences. All ordinary text is skipped entirely. @d Scan the source file, look... @{{ int c = source_get(); while (c != EOF) { if (c == nw_char) @ c = source_get(); } }@} Only four of the at-sequences are interesting during the first pass. We skip past others immediately; warning if unexpected sequences are discovered. @d Scan at-sequence @{{ char quoted = 0; c = source_get(); switch (c) { case 'r': c = source_get(); nw_char = c; update_delimit_scrap(); break; case 'O': case 'o': @ break; case 'Q': case 'q': quoted = 1; case 'D': case 'd': @ break; case 's': @ break; case 'S': @ break; case '<': case '(': case '[': case '{': @ break; case 'c': @ break; case 'x': case 'v': case 'u': case 'm': case 'f': /* ignore during this pass */ break; default: if (c==nw_char) /* ignore during this pass */ break; fprintf(stderr, "%s: unexpected %c sequence ignored (%s, line %d)\n", command_name, nw_char, source_name, source_line); break; } }@} @d Step to next sector @{ prev_sector += 1; current_sector = prev_sector; c = source_get(); @} @d Close the current sector @{current_sector = 1; c = source_get(); @} @d Global variable declar... @{extern unsigned char current_sector; extern unsigned char prev_sector; @| current_sector prev_sector @} @d Global variable def... @{unsigned char current_sector = 1; unsigned char prev_sector = 1; @} \subsection{Accumulating Definitions} There are three steps required to handle a definition: \begin{enumerate} \item Build an entry for the name so we can look it up later. \item Collect the scrap and save it in the table of scraps. \item Attach the scrap to the name. \end{enumerate} We go through the same steps for both file names and fragment names. @d Build output file definition @{{ Name *name = collect_file_name(); /* returns a pointer to the name entry */ int scrap = collect_scrap(); /* returns an index to the scrap */ @ }@} @d Build fragment definition @{{ Name *name = collect_macro_name(); int scrap = collect_scrap(); @ }@} Since a file or fragment may be defined by many scraps, we maintain them in a simple linked list. The list is actually built in reverse order, with each new definition being added to the head of the list. @d Add \verb|scrap| to \verb|name|'s definition list @{{ Scrap_Node *def = (Scrap_Node *) arena_getmem(sizeof(Scrap_Node)); def->scrap = scrap; def->quoted = quoted; def->next = name->defs; name->defs = def; }@} @d Skip over an in-text scrap @{ { int c; int depth = 1; while ((c = source_get()) != EOF) { if (c == nw_char) @ } fprintf(stderr, "%s: unexpected EOF in text at (%s, %d)\n", command_name, source_name, source_line); exit(-1); skipped: ; } @} @d Skip over at-sign or go to skipped @{ { c = source_get(); switch (c) { case '{': case '[': case '(': case '<': depth += 1; break; case '}': case ']': case ')': case '>': if (--depth == 0) goto skipped; case 'x': case '|': case ',': case '%': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': case '_': case 'f': case '#': case '+': case '-': case 'v': case '*': case 'c': case '\'': case 's': break; default: if (c != nw_char) { fprintf(stderr, "%s: unexpected %c%c in text at (%s, %d)\n", command_name, nw_char, c, source_name, source_line); exit(-1); } break; } } @} \subsection{Block Comments} @d Collect a block comment @{{ char * p = blockBuff; char * e = blockBuff + (sizeof(blockBuff)/sizeof(blockBuff[0])) - 1; @ while (p < e) { @ } if (p == e) { @ } *p = '\000'; } @} @d Skip whitespace @{while (source_peek == ' ' || source_peek == '\t' || source_peek == '\n') (void)source_get(); @} @d Add one char to the block buffer @{int c = source_get(); if (c == nw_char) { @ } else if (c == EOF) { source_ungetc(&c); break; } else { @ } @} @d Add an at character to the block or break @{int cc = source_peek; if (cc == 'c') { do c = source_get(); while (c <= ' '); break; } else if (cc == 'd' || cc == 'D' || cc == 'q' || cc == 'Q' || cc == 'o' || cc == 'O' || cc == EOF) { source_ungetc(&c); break; } else { *p++ = c; *p++ = source_get(); } @} @d Add any other character to the block @{ @ *p++ = c; @} @d Perhaps skip white-space @{if (c == ' ') { while (source_peek == ' ') c = source_get(); } if (c == '\n') { if (source_peek == '\n') { do c = source_get(); while (source_peek == '\n'); } else c = ' '; } @} @d Skip to the next nw-char @{int c; while ((c = source_get()), c != nw_char && c != EOF)/* Skip */ source_ungetc(&c);@} @d Global variable declar... @{extern char blockBuff[6400]; @| blockBuff @} @d Global variable def... @{char blockBuff[6400]; @} We don't show block comments inside scraps in the \TeX\ file, but we do show where they go. @d Show presence of a block comment @{{ fputs(delimit_scrap[scrap_type][1],file); fprintf(file, "\\hbox{\\sffamily\\slshape (Comment)}"); fputs(delimit_scrap[scrap_type][0], file); } @} The block comment is presently in the block buffer. Here we copy it into the scrap whence we shall copy it into the output file. @d Include block comment in a scrap @{{ char * p = blockBuff; push(nw_char, &writer); do { push(c, &writer); c = *p++; } while (c != '\0'); }@} @d Copy block comment from scrap @{{ int bgn = indent + global_indent; int posn = bgn + strlen(comment_begin[comment_flag]); int i; @ c = pop(&reader); fputs(comment_begin[comment_flag], file); while (c != '\0') { @ @ { putc('\n', file); for (i = 0; i < bgn ; i++) putc(' ', file); c = pop(&reader); if (c != '\0') { posn = bgn + strlen(comment_mid[comment_flag]); fputs(comment_mid[comment_flag], file); } } } fputs(comment_end[comment_flag], file); } @} @d Move a word to the file @{do { putc(c, file); posn += 1; c = pop(&reader); } while (c > ' '); @} @d If we break the line at this word @{if (c == '\n' || (c == ' ' && posn > 60))@} @d Skip over a block comment @{if (last == nw_char && c == 'c') while ((c = pop(reader.m)) != '\0') /* Skip */; @} @d Begin or end a block comment @{if (inBlock) { @ } else { @ }@} \subsection{Fixing the Cross References} Since the definition and reference lists for each name are accumulated in reverse order, we take the time at the end of \verb|pass1| to reverse them all so they'll be simpler to print out prettily. The code for \verb|reverse_lists| appears in Section~\ref{names}. @d Reverse cross-reference lists @{{ reverse_lists(file_names); reverse_lists(macro_names); reverse_lists(user_names); }@} \subsection{Dealing with fragment parameters} Fragment parameters were added on later in nuweb's development. There still is not, for example, an index of fragment parameters. We need a data type to keep track of fragment parameters. @d Type decl... @{typedef int *Parameters; @| Parameters @} When we are copying a scrap to the output, we check for an embedded parameter. If we find it we copy it. Otherwise it's an old-style parameter and we can then pull the $n$th string from the \verb|Parameters| list when we see an \verb|@@1| \verb|@@2|, etc. @D Handle macro parameter substitution @{ case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': { Arglist * args; Name * name; lookup(c - '1', inArgs, inParams, &name, &args); if (name == (Name *)1) { Embed_Node * q = (Embed_Node *)args; indent = write_scraps(file, spelling, q->defs, global_indent + indent, indent_chars, debug_flag, tab_flag, indent_flag, 0, q->args, inParams, local_parameters, ""); } else if (name != NULL) { int i, narg; char * p = name->spelling; Arglist *q = args; @ indent = write_scraps(file, spelling, name->defs, global_indent + indent, indent_chars, debug_flag, tab_flag, indent_flag, comment_flag, args, name->arg, local_parameters, p); } else if (args != NULL) { if (delayed_indent) { @ } fputs((char *)args, file); } else if ( parameters && parameters[c - '1'] ) { Scrap_Node param_defs; param_defs.scrap = parameters[c - '1']; param_defs.next = 0; write_scraps(file, spelling, ¶m_defs, global_indent + indent, indent_chars, debug_flag, tab_flag, indent_flag, comment_flag, NULL, NULL, 0, ""); } else if (delayed_indent) { @ } } @} Now onto actually parsing fragment parameters from a call. We start off checking for fragment parameters, an \verb|@@(| sequence followed by parameters separated by \verb|@@,| sequences, and terminated by a \verb|@@)| sequence. We collect separate scraps for each parameter, and write the scrap numbers down in the text. For example, if the file has: \begin{verbatim} @@ \end{verbatim} we actually make new scraps, say 10 and 11, for param1 and param2, and write in the collected scrap: \begin{verbatim} @@ \end{verbatim} @d Save macro parameters @{{ int param_scrap; char param_buf[10]; push(nw_char, &writer); push('(', &writer); do { param_scrap = collect_scrap(); sprintf(param_buf, "%d", param_scrap); pushs(param_buf, &writer); push(nw_char, &writer); push(scrap_ended_with, &writer); add_to_use(name, current_scrap); } while( scrap_ended_with == ',' ); do c = source_get(); while( ' ' == c ); if (c == nw_char) { c = source_get(); } if (c != '>') { /* ZZZ print error */; } }@} If we get inside, we have at least one parameter, which will be at the beginning of the parms buffer, and we prime the pump with the first character. @d Check for macro parameters @{ if (c == '(') { Parameters res = arena_getmem(10 * sizeof(int)); int *p2 = res; int count = 0; int scrapnum; while( c && c != ')' ) { scrapnum = 0; c = pop(manager); while( '0' <= c && c <= '9' ) { scrapnum = scrapnum * 10 + c - '0'; c = pop(manager); } if ( c == nw_char ) { c = pop(manager); } *p2++ = scrapnum; } while (count < 10) { *p2++ = 0; count++; } while( c && c != nw_char ) { c = pop(manager); } if ( c == nw_char ) { c = pop(manager); } *parameters = res; } @} These are used in \verb|write_tex| and \verb|write_html| to output the argument list for a fragment. @d Format macro parameters @{ char sep; sep = '('; do { fputc(sep,file); fputs("{\\footnotesize ", file); write_single_scrap_ref(file, scraps + 1); fprintf(file, "\\label{scrap%d}\n", scraps + 1); fputs(" }", file); source_last = '{'; copy_scrap(file, TRUE, NULL); ++scraps; sep = ','; } while ( source_last != ')' && source_last != EOF ); fputs(" ) ",file); do c = source_get(); while(c != nw_char && c != EOF); if (c == nw_char) { c = source_get(); } @} @d Format HTML macro parameters @{ char sep; sep = '('; fputs("\\begin{rawhtml}", file); do { fputc(sep,file); fprintf(file, "%d ", scraps, scraps); source_last = '{'; copy_scrap(file, TRUE); ++scraps; sep = ','; } while ( source_last != ')' && source_last != EOF ); fputs(" ) ",file); do c = source_get(); while(c != nw_char && c != EOF); if (c == nw_char) { c = source_get(); } fputs("\\end{rawhtml}", file); @} \section{Writing the Latex File} \label{latex-file} The second pass (invoked via a call to \verb|write_tex|) copies most of the text from the source file straight into a \verb|.tex| file. Definitions are formatted slightly and cross-reference information is printed out. Note that all the formatting is handled in this section. If you don't like the format of definitions or indices or whatever, it'll be in this section somewhere. Similarly, if someone wanted to modify nuweb to work with a different typesetting system, this would be the place to look. @d Function... @{extern void write_tex(char *file_name, char *tex_name); @} We need a few local function declarations before we get into the body of \verb|write_tex|. @o latex.c -cc -d @{static void copy_scrap(FILE *file, int prefix, Name *name); /* formats the body of a scrap */ static void print_scrap_numbers(FILE *tex_file, Scrap_Node *scraps); /* formats a list of scrap numbers */ static void format_entry(Name *name, FILE *tex_file, unsigned char sector); /* formats an index entry */ static void format_file_entry(Name *name, FILE *tex_file); /* formats a file index entry */ static void format_user_entry(Name *name, FILE *tex_file, unsigned char sector); static void write_arg(FILE *tex_file, char *p); static void write_literal(FILE *tex_file, char *p, int mode); static void write_ArglistElement(FILE *file, Arglist *args, char **params); @} The routine \verb|write_tex| takes two file names as parameters: the name of the web source file and the name of the \verb|.tex| output file. @o latex.c -cc -d @{void write_tex(char *file_name, char *tex_name) { FILE *tex_file = fopen(tex_name, "w"); if (tex_file) { if (verbose_flag) fprintf(stderr, "writing %s\n", tex_name); source_open(file_name); @ @ fclose(tex_file); } else fprintf(stderr, "%s: can't open %s\n", command_name, tex_name); } @| write_tex @} Now that the \verb|\NW...| macros are used, it seems convenient to write default definitions for those macros so that source files need not define anything new. If a user wants to change any of the macros (to use hyperref or to write in some language other than english) he or she can redefine the commands. @d Write LaTeX limbo definitions @{if (hyperref_flag) { fputs("\\newcommand{\\NWtarget}[2]{\\hypertarget{#1}{#2}}\n", tex_file); fputs("\\newcommand{\\NWlink}[2]{\\hyperlink{#1}{#2}}\n", tex_file); } else { fputs("\\newcommand{\\NWtarget}[2]{#2}\n", tex_file); fputs("\\newcommand{\\NWlink}[2]{#2}\n", tex_file); } fputs("\\newcommand{\\NWtxtMacroDefBy}{Fragment defined by}\n", tex_file); fputs("\\newcommand{\\NWtxtMacroRefIn}{Fragment referenced in}\n", tex_file); fputs("\\newcommand{\\NWtxtMacroNoRef}{Fragment never referenced}\n", tex_file); fputs("\\newcommand{\\NWtxtDefBy}{Defined by}\n", tex_file); fputs("\\newcommand{\\NWtxtRefIn}{Referenced in}\n", tex_file); fputs("\\newcommand{\\NWtxtNoRef}{Not referenced}\n", tex_file); fputs("\\newcommand{\\NWtxtFileDefBy}{File defined by}\n", tex_file); fputs("\\newcommand{\\NWtxtIdentsUsed}{Uses:}\n", tex_file); fputs("\\newcommand{\\NWtxtIdentsNotUsed}{Never used}\n", tex_file); fputs("\\newcommand{\\NWtxtIdentsDefed}{Defines:}\n", tex_file); fputs("\\newcommand{\\NWsep}{${\\diamond}$}\n", tex_file); fputs("\\newcommand{\\NWnotglobal}{(not defined globally)}\n", tex_file); fputs("\\newcommand{\\NWuseHyperlinks}{", tex_file); if (hyperoptions[0] != '\0') { @ } fputs("}\n", tex_file); @} @d Write the hyperlink usage macro @{fprintf(tex_file, "\\usepackage[%s]{hyperref}", hyperoptions);@} We make our second (and final) pass through the source web, this time copying characters straight into the \verb|.tex| file. However, we keep an eye peeled for \verb|@@|~characters, which signal a command sequence. @d Copy \verb|source_file| into \verb|tex_file| @{{ int inBlock = FALSE; int c = source_get(); while (c != EOF) { if (c == nw_char) { @ } else { putc(c, tex_file); c = source_get(); } } }@} @d Interpret at-sequence @{{ int big_definition = FALSE; c = source_get(); switch (c) { case 'r': c = source_get(); nw_char = c; update_delimit_scrap(); break; case 'O': big_definition = TRUE; case 'o': @ break; case 'Q': case 'D': big_definition = TRUE; case 'q': case 'd': @ break; case 's': @ break; case 'S': @ break; case '{': case '[': case '(': @ break; case '<': @ break; case 'x': @ c = source_get(); break; case 'c': @ c = source_get(); break; case 'f': @ break; case 'm': @ break; case 'u': @ break; case 'v': @ break; default: if (c==nw_char) putc(c, tex_file); c = source_get(); break; } }@} @d Copy version info into tex file @{fputs(version_string, tex_file); c = source_get(); @} @d Expand macro into @'file@' @{{ Parameters local_parameters = 0; int changed; char indent_chars[MAX_INDENT]; Arglist *a; Name *name; Arglist * args; a = collect_scrap_name(0); name = a->name; args = instance(a->args, NULL, NULL, &changed); name->mark = TRUE; write_scraps(@1, tex_name, name->defs, 0, indent_chars, 0, 0, 1, 0, args, name->arg, local_parameters, tex_name); name->mark = FALSE; c = source_get(); } @} \subsection{Formatting Definitions} We go through a fair amount of effort to format a file definition. I've derived most of the \LaTeX\ commands experimentally; it's quite likely that an expert could do a better job. The \LaTeX\ for the previous fragment definition should look like this (perhaps modulo the scrap references): {\small \begin{verbatim} \begin{flushleft} \small \begin{minipage}{\linewidth}\label{scrap70}\raggedright\small $\langle$Interpret at-sequence {\footnotesize 18}$\rangle\equiv$ \vspace{-1ex} \begin{list}{}{} \item \mbox{}\verb@@{@@\\ \mbox{}\verb@@ int big_definition = FALSE;@@\\ \mbox{}\verb@@ c = source_get();@@\\ \mbox{}\verb@@ switch (c) {@@\\ \mbox{}\verb@@ case 'O': big_definition = TRUE;@@\\ \mbox{}\verb@@ case 'o': @@$\langle$Write output file definition {\footnotesize 19a}$\rangle$\verb@@@@\\ \end{verbatim} \vdots \begin{verbatim} \mbox{}\verb@@ case '@@{\tt @@}\verb@@': putc(c, tex_file);@@\\ \mbox{}\verb@@ default: c = source_get();@@\\ \mbox{}\verb@@ break;@@\\ \mbox{}\verb@@ }@@\\ \mbox{}\verb@@}@@$\diamond$ \end{list} \vspace{-1ex} \footnotesize\addtolength{\baselineskip}{-1ex} \begin{list}{}{\setlength{\itemsep}{-\parsep}\setlength{\itemindent}{-\leftmargin}} \item Fragment referenced in scrap 17b. \end{list} \end{minipage}\vspace{4ex} \end{flushleft} \end{verbatim}} \noindent The {\em flushleft\/} environment is used to avoid \LaTeX\ warnings about underful lines. The {\em minipage\/} environment is used to avoid page breaks in the middle of scraps. The {\em verb\/} command allows arbitrary characters to be printed (however, note the special handling of the \verb|@@| case in the switch statement). Fragment and file definitions are formatted nearly identically. I've factored the common parts out into separate scraps. @d Write output file definition @{{ Name *name = collect_file_name(); @ fputs("\\NWtarget{nuweb", tex_file); write_single_scrap_ref(tex_file, scraps); fputs("}{} ", tex_file); fprintf(tex_file, "\\verb%c\"%s\"%c\\nobreak\\ {\\footnotesize {", nw_char, name->spelling, nw_char); write_single_scrap_ref(tex_file, scraps); fputs("}}$\\equiv$\n", tex_file); @ @ if ( scrap_flag ) { @ } format_defs_refs(tex_file, scraps); format_uses_refs(tex_file, scraps++); @ @ }@} I don't format a fragment name at all specially, figuring the programmer might want to use italics or bold face in the midst of the name. @d Write macro definition @{{ Name *name = collect_macro_name(); @ fputs("\\NWtarget{nuweb", tex_file); write_single_scrap_ref(tex_file, scraps); fputs("}{} $\\langle\\,${\\itshape ", tex_file); @ fputs("}\\nobreak\\ {\\footnotesize {", tex_file); write_single_scrap_ref(tex_file, scraps); fputs("}}$\\,\\rangle\\equiv$\n", tex_file); @ @ @ @ format_defs_refs(tex_file, scraps); format_uses_refs(tex_file, scraps++); @ @ }@} @d Write the macro's name @{{ char * p = name->spelling; int i = 0; while (*p != '\000') { if (*p == ARG_CHR) { write_arg(tex_file, name->arg[i++]); p++; } else fputc(*p++, tex_file); } }@} @o latex.c -cc -d @{static void write_arg(FILE * tex_file, char * p) { fputs("\\hbox{\\slshape\\sffamily ", tex_file); while (*p) { switch (*p) { case '$': case '_': case '^': case '#': fputc('\\', tex_file); break; default: break; } fputc(*p, tex_file); p++; } fputs("\\/}", tex_file); } @| write_arg @} @d Begin the scrap environment @{{ if (big_definition) { if (inBlock) { @ } fputs("\\begin{flushleft} \\small", tex_file); } else { if (inBlock) { @ } else { @ } } fprintf(tex_file, "\\label{scrap%d}\\raggedright\\small\n", scraps); }@} @d Start block @{fputs("\\begin{flushleft} \\small\n\\begin{minipage}{\\linewidth}", tex_file); inBlock = TRUE;@} @d Switch block @{fputs("\\par\\vspace{\\baselineskip}\n", tex_file);@} @d End block @{fputs("\\end{minipage}\\vspace{4ex}\n", tex_file); fputs("\\end{flushleft}\n", tex_file); inBlock = FALSE;@} The interesting things here are the $\diamond$ inserted at the end of each scrap and the various spacing commands. The diamond helps to clearly indicate the end of a scrap. The spacing commands were derived empirically; they may be adjusted to taste. @d Fill in the middle of the scrap environment @{{ fputs("\\vspace{-1ex}\n\\begin{list}{}{} \\item\n", tex_file); extra_scraps = 0; copy_scrap(tex_file, TRUE, name); fputs("{\\NWsep}\n\\end{list}\n", tex_file); }@} \newpage \noindent We've got one last spacing command, controlling the amount of white space after a scrap. Note also the whitespace eater. I use it to remove any blank lines that appear after a scrap in the source file. This way, text following a scrap will not be indented. Again, this is a matter of personal taste. @d Finish the scrap environment @{{ scraps += extra_scraps; if (big_definition) fputs("\\vspace{4ex}\n\\end{flushleft}\n", tex_file); else { @ } do c = source_get(); while (isspace(c)); }@} @d Global variable declar... @{extern int extra_scraps; @| extra_scraps @} @d Global variable def... @{int extra_scraps = 0; @} @d Write in-text scrap @{copy_scrap(tex_file, FALSE, NULL); c = source_get(); @} \subsubsection{Formatting Cross References} @d Begin the cross-reference environment @{{ fputs("\\vspace{-1.5ex}\n", tex_file); fputs("\\footnotesize\n", tex_file); fputs("\\begin{list}{}{\\setlength{\\itemsep}{-\\parsep}", tex_file); fputs("\\setlength{\\itemindent}{-\\leftmargin}}\n", tex_file);} @} There may (unusually) be nothing to output in the cross-reference section, so output an empty list item anyway to avoid having an empty list. @d Finish the cross-reference environment @{{ fputs("\n\\item{}", tex_file); fputs("\n\\end{list}\n", tex_file); }@} @d Write file defs @{{ if (name->defs) { if (name->defs->next) { fputs("\\item \\NWtxtFileDefBy\\ ", tex_file); print_scrap_numbers(tex_file, name->defs); } } else { fprintf(stderr, "would have crashed in 'Write file defs' for '%s'\n", name->spelling); } }@} @d Write macro defs @{{ if (name->defs->next) { fputs("\\item \\NWtxtMacroDefBy\\ ", tex_file); print_scrap_numbers(tex_file, name->defs); } }@} @d Write macro refs @{{ if (name->uses) { if (name->uses->next) { fputs("\\item \\NWtxtMacroRefIn\\ ", tex_file); print_scrap_numbers(tex_file, name->uses); } else { fputs("\\item \\NWtxtMacroRefIn\\ ", tex_file); fputs("\\NWlink{nuweb", tex_file); write_single_scrap_ref(tex_file, name->uses->scrap); fputs("}{", tex_file); write_single_scrap_ref(tex_file, name->uses->scrap); fputs("}", tex_file); fputs(".\n", tex_file); } } else { fputs("\\item {\\NWtxtMacroNoRef}.\n", tex_file); fprintf(stderr, "%s: <%s> never referenced.\n", command_name, name->spelling); } }@} @o latex.c -cc -d @{static void print_scrap_numbers(FILE *tex_file, Scrap_Node *scraps) { int page; fputs("\\NWlink{nuweb", tex_file); write_scrap_ref(tex_file, scraps->scrap, -1, &page); fputs("}{", tex_file); write_scrap_ref(tex_file, scraps->scrap, TRUE, &page); fputs("}", tex_file); scraps = scraps->next; while (scraps) { fputs("\\NWlink{nuweb", tex_file); write_scrap_ref(tex_file, scraps->scrap, -1, &page); fputs("}{", tex_file); write_scrap_ref(tex_file, scraps->scrap, FALSE, &page); scraps = scraps->next; fputs("}", tex_file); } fputs(".\n", tex_file); } @| print_scrap_numbers @} \subsubsection{Formatting a Scrap} We add a \verb|\mbox{}| at the beginning of each line to avoid problems with older versions of \TeX. This is the only place we really care whether a scrap is delimited with \verb|@@{...@@}|, \verb|@@[...@@]|, or \verb|@@(...@@)|, and we base our output sequences on that. We have an array \texttt{delimit\_scrap} where we store our strings that perform the formatting for one line of a scrap. Upon initialisation we copy our strings from the source \texttt{orig\_delimit\_scrap} to it to have the strings writeable. We might change the strings later when we encounter the command to change the `nuweb character'. @o latex.c @{static char *orig_delimit_scrap[3][5] = { /* {} mode: begin, end, insert nw_char, prefix, suffix */ { "\\verb@@", "@@", "@@{\\tt @@}\\verb@@", "\\mbox{}", "\\\\" }, /* [] mode: begin, end, insert nw_char, prefix, suffix */ { "", "", "@@", "", "" }, /* () mode: begin, end, insert nw_char, prefix, suffix */ { "$", "$", "@@", "", "" }, }; static char *delimit_scrap[3][5]; @} The function \texttt{initialise\_delimit\_scrap\_array} does the copying. If we want to have the listings package \index{listings package} do the formatting we have to replace only two of those strings: the \texttt{verb} command has to be replaced by the package's \texttt{lstinline} command. @o latex.c @{void initialise_delimit_scrap_array() { int i,j; for(i = 0; i < 3; i++) { for(j = 0; j < 5; j++) { if((delimit_scrap[i][j] = strdup(orig_delimit_scrap[i][j])) == NULL) { fprintf(stderr, "Not enough memory for string allocation\n"); exit(EXIT_FAILURE); } } } /* replace verb by lstinline */ if (listings_flag) { free(delimit_scrap[0][0]); if((delimit_scrap[0][0]=strdup("\\lstinline@@")) == NULL) { fprintf(stderr, "Not enough memory for string allocation\n"); exit(EXIT_FAILURE); } free(delimit_scrap[0][2]); if((delimit_scrap[0][2]=strdup("@@{\\tt @@}\\lstinline@@")) == NULL) { fprintf(stderr, "Not enough memory for string allocation\n"); exit(EXIT_FAILURE); } } } @} @d Function prototypes @{void initialise_delimit_scrap_array(void); @} @o latex.c @{int scrap_type = 0; @| scrap_type @} @o latex.c -cc -d @{static void write_literal(FILE * tex_file, char * p, int mode) { fputs(delimit_scrap[mode][0], tex_file); while (*p!= '\000') { if (*p == nw_char) fputs(delimit_scrap[mode][2], tex_file); else fputc(*p, tex_file); p++; } fputs(delimit_scrap[mode][1], tex_file); } @| write_literal @} @o latex.c -cc -d @{static void copy_scrap(FILE *file, int prefix, Name *name) { int indent = 0; int c; char ** params = name->arg; if (source_last == '{') scrap_type = 0; if (source_last == '[') scrap_type = 1; if (source_last == '(') scrap_type = 2; c = source_get(); if (prefix) fputs(delimit_scrap[scrap_type][3], file); fputs(delimit_scrap[scrap_type][0], file); while (1) { switch (c) { case '\n': fputs(delimit_scrap[scrap_type][1], file); if (prefix) fputs(delimit_scrap[scrap_type][4], file); fputs("\n", file); if (prefix) fputs(delimit_scrap[scrap_type][3], file); fputs(delimit_scrap[scrap_type][0], file); indent = 0; break; case '\t': @ break; default: if (c==nw_char) { @ break; } putc(c, file); indent++; break; } c = source_get(); } } @| copy_scrap scrap_type @} When we encounter the command to change the `nuweb character' we call this function. It updates the scrap formatting directives accordingly. @o latex.c @{void update_delimit_scrap() { /* {}-mode begin */ if (listings_flag) { delimit_scrap[0][0][10] = nw_char; } else { delimit_scrap[0][0][5] = nw_char; } /* {}-mode end */ delimit_scrap[0][1][0] = nw_char; /* {}-mode insert nw_char */ delimit_scrap[0][2][0] = nw_char; delimit_scrap[0][2][6] = nw_char; if (listings_flag) { delimit_scrap[0][2][18] = nw_char; } else { delimit_scrap[0][2][13] = nw_char; } /* []-mode insert nw_char */ delimit_scrap[1][2][0] = nw_char; /* ()-mode insert nw_char */ delimit_scrap[2][2][0] = nw_char; } @| update_delimit_scrap @} @d Function prototypes @{void update_delimit_scrap(); @} @d Expand tab into spaces @{{ int delta = 8 - (indent % 8); indent += delta; while (delta > 0) { putc(' ', file); delta--; } }@} @d Check at-sequence... @{{ c = source_get(); switch (c) { case 'c': @ break; case 'x': @ break; case 'v': @ case 's': break; case '+': case '-': case '*': case '|': @ case ',': case ')': case ']': case '}': fputs(delimit_scrap[scrap_type][1], file); return; case '<': @ break; case '%': @ break; case '_': @ break; case 't': @ break; case 'f': @ break; case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': if (name == NULL || name->arg[c - '1'] == NULL) { fputs(delimit_scrap[scrap_type][2], file); fputc(c, file); } else { fputs(delimit_scrap[scrap_type][1], file); write_arg(file, name->arg[c - '1']); fputs(delimit_scrap[scrap_type][0], file); } break; default: if (c==nw_char) { fputs(delimit_scrap[scrap_type][2], file); break; } /* ignore these since pass1 will have warned about them */ break; } }@} @d Copy version info into file @{fputs(version_string, file); @} @d Copy label from source into @{{ @ write_label(label_name, @1); }@} There's no need to check for errors here, since we will have already pointed out any during the first pass. @d Skip over index entries @{{ do { do c = source_get(); while (c != nw_char); c = source_get(); } while (c != '}' && c != ']' && c != ')' ); }@} @d Skip commented-out code... @{{ do c = source_get(); while (c != '\n'); }@} This scrap helps deal with bold keywords: @d Bold Keyword @{{ fputs(delimit_scrap[scrap_type][1],file); fprintf(file, "\\hbox{\\sffamily\\bfseries "); c = source_get(); do { fputc(c, file); c = source_get(); } while (c != nw_char); c = source_get(); fprintf(file, "}"); fputs(delimit_scrap[scrap_type][0], file); }@} @d Italic "@'whatever@'" @{{ fputs(delimit_scrap[scrap_type][1],file); fprintf(file, "\\hbox{\\sffamily\\slshape @1}"); fputs(delimit_scrap[scrap_type][0], file); }@} @d Format macro name @{{ Arglist *args = collect_scrap_name(-1); Name *name = args->name; char * p = name->spelling; Arglist *q = args->args; int narg = 0; fputs(delimit_scrap[scrap_type][1],file); if (prefix) fputs("\\hbox{", file); fputs("$\\langle\\,${\\itshape ", file); while (*p != '\000') { if (*p == ARG_CHR) { if (q == NULL) { write_literal(file, name->arg[narg], scrap_type); } else { write_ArglistElement(file, q, params); q = q->next; } p++; narg++; } else fputc(*p++, file); } fputs("}\\nobreak\\ ", file); if (scrap_name_has_parameters) { @ } fprintf(file, "{\\footnotesize "); if (name->defs) @ else { putc('?', file); fprintf(stderr, "%s: never defined <%s>\n", command_name, name->spelling); } fputs("}$\\,\\rangle$", file); if (prefix) fputs("}", file); fputs(delimit_scrap[scrap_type][0], file); }@} @d Write abbreviated definition list @{{ Scrap_Node *p = name->defs; fputs("\\NWlink{nuweb", file); write_single_scrap_ref(file, p->scrap); fputs("}{", file); write_single_scrap_ref(file, p->scrap); fputs("}", file); p = p->next; if (p) fputs(", \\ldots\\ ", file); }@} @o latex.c -cc -d @{static void write_ArglistElement(FILE * file, Arglist * args, char ** params) { Name *name = args->name; Arglist *q = args->args; if (name == NULL) { char * p = (char*)q; if (p[0] == ARG_CHR) { write_arg(file, params[p[1] - '1']); } else { write_literal(file, (char *)q, 0); } } else if (name == (Name *)1) { Scrap_Node * qq = (Scrap_Node *)q; qq->quoted = TRUE; fputs(delimit_scrap[scrap_type][0], file); write_scraps(file, "", qq, -1, "", 0, 0, 0, 0, NULL, params, 0, ""); fputs(delimit_scrap[scrap_type][1], file); extra_scraps++; qq->quoted = FALSE; } else { char * p = name->spelling; fputs("$\\langle\\,${\\itshape ", file); while (*p != '\000') { if (*p == ARG_CHR) { write_ArglistElement(file, q, params); q = q->next; p++; } else fputc(*p++, file); } fputs("}\\nobreak\\ ", file); if (scrap_name_has_parameters) { int c; @ } fprintf(file, "{\\footnotesize "); if (name->defs) @ else { putc('?', file); fprintf(stderr, "%s: never defined <%s>\n", command_name, name->spelling); } fputs("}$\\,\\rangle$", file); } } @| write_ArglistElement @} \subsection{Generating the Indices} @d Write index of file names @{{ if (file_names) { fputs("\n{\\small\\begin{list}{}{\\setlength{\\itemsep}{-\\parsep}", tex_file); fputs("\\setlength{\\itemindent}{-\\leftmargin}}\n", tex_file); format_file_entry(file_names, tex_file); fputs("\\end{list}}", tex_file); } c = source_get(); }@} @o latex.c -cc -d @{static void format_file_entry(Name *name, FILE *tex_file) { while (name) { format_file_entry(name->llink, tex_file); @ name = name->rlink; } } @| format_file_entry @} @d Format a file index entry @{fputs("\\item ", tex_file); fprintf(tex_file, "\\verb%c\"%s\"%c ", nw_char, name->spelling, nw_char); @ putc('\n', tex_file);@} @d Write file's defining scrap numbers @{{ Scrap_Node *p = name->defs; fputs("{\\footnotesize {\\NWtxtDefBy}", tex_file); if (p->next) { /* fputs("s ", tex_file); */ putc(' ', tex_file); print_scrap_numbers(tex_file, p); } else { putc(' ', tex_file); fputs("\\NWlink{nuweb", tex_file); write_single_scrap_ref(tex_file, p->scrap); fputs("}{", tex_file); write_single_scrap_ref(tex_file, p->scrap); fputs("}", tex_file); putc('.', tex_file); } putc('}', tex_file); }@} @d Write index of macro names @{{ unsigned char sector = current_sector; int c = source_get(); if (c == '+') sector = 0; else source_ungetc(&c); if (has_sector(macro_names, sector)) { fputs("\n{\\small\\begin{list}{}{\\setlength{\\itemsep}{-\\parsep}", tex_file); fputs("\\setlength{\\itemindent}{-\\leftmargin}}\n", tex_file); format_entry(macro_names, tex_file, sector); fputs("\\end{list}}", tex_file); } else { fputs("None.\n", tex_file); } } c = source_get(); @} @o latex.c -cc -d @{static int load_entry(Name * name, Name ** nms, int n) { while (name) { n = load_entry(name->llink, nms, n); nms[n++] = name; name = name->rlink; } return n; } @| load_entry @} @o latex.c -cc -d @{static void format_entry(Name *name, FILE *tex_file, unsigned char sector) { Name ** nms = malloc(num_scraps()*sizeof(Name *)); int n = load_entry(name, nms, 0); int i; @@> for (i = 0; i < n; i++) { Name * name = nms[i]; @ } } @| format_entry @} @d Rob's... @{robs_strcmp(ki->spelling, kj->spelling) < 0@} @d Sort @'key@' of size @'n@' for @'ordering@' @{int j; for (j = 1; j < @2; j++) { int i = j - 1; Name * kj = @1[j]; do { Name * ki = @1[i]; if (@3) break; @1[i + 1] = ki; i -= 1; } while (i >= 0); @1[i + 1] = kj; } @} @d Format an index entry @{if (name->sector == sector){ fputs("\\item ", tex_file); fputs("$\\langle\\,$", tex_file); @ fputs("\\nobreak\\ {\\footnotesize ", tex_file); @ fputs("}$\\,\\rangle$ ", tex_file); @ putc('\n', tex_file); }@} @d Write defining scrap numbers @{{ Scrap_Node *p = name->defs; if (p) { int page; fputs("\\NWlink{nuweb", tex_file); write_scrap_ref(tex_file, p->scrap, -1, &page); fputs("}{", tex_file); write_scrap_ref(tex_file, p->scrap, TRUE, &page); fputs("}", tex_file); p = p->next; while (p) { fputs("\\NWlink{nuweb", tex_file); write_scrap_ref(tex_file, p->scrap, -1, &page); fputs("}{", tex_file); write_scrap_ref(tex_file, p->scrap, FALSE, &page); fputs("}", tex_file); p = p->next; } } else putc('?', tex_file); }@} @d Write referencing scrap numbers @{{ Scrap_Node *p = name->uses; fputs("{\\footnotesize ", tex_file); if (p) { fputs("{\\NWtxtRefIn}", tex_file); if (p->next) { /* fputs("s ", tex_file); */ putc(' ', tex_file); print_scrap_numbers(tex_file, p); } else { putc(' ', tex_file); fputs("\\NWlink{nuweb", tex_file); write_single_scrap_ref(tex_file, p->scrap); fputs("}{", tex_file); write_single_scrap_ref(tex_file, p->scrap); fputs("}", tex_file); putc('.', tex_file); } } else fputs("{\\NWtxtNoRef}.", tex_file); putc('}', tex_file); }@} @d Function prototypes @{extern int has_sector(Name *, unsigned char); @} @o latex.c -cc -d @{int has_sector(Name * name, unsigned char sector) { while(name) { if (name->sector == sector) return TRUE; if (has_sector(name->llink, sector)) return TRUE; name = name->rlink; } return FALSE; } @| has_sector @} @d Write index of user-specified names @{{ unsigned char sector = current_sector; c = source_get(); if (c == '+') { sector = 0; c = source_get(); } if (has_sector(user_names, sector)) { fputs("\n{\\small\\begin{list}{}{\\setlength{\\itemsep}{-\\parsep}", tex_file); fputs("\\setlength{\\itemindent}{-\\leftmargin}}\n", tex_file); format_user_entry(user_names, tex_file, sector); fputs("\\end{list}}", tex_file); } }@} @o latex.c -cc -d @{static void format_user_entry(Name *name, FILE *tex_file, unsigned char sector) { while (name) { format_user_entry(name->llink, tex_file, sector); @ name = name->rlink; } } @| format_user_entry @} @D Format a user index entry @{if (name->sector == sector){ Scrap_Node *uses = name->uses; if ( uses || dangling_flag ) { int page; Scrap_Node *defs = name->defs; fprintf(tex_file, "\\item \\verb%c%s%c: ", nw_char,name->spelling,nw_char); if (!uses) { fputs("(\\underline{", tex_file); fputs("\\NWlink{nuweb", tex_file); write_single_scrap_ref(tex_file, defs->scrap); fputs("}{", tex_file); write_single_scrap_ref(tex_file, defs->scrap); fputs("})}", tex_file); page = -2; defs = defs->next; } else if (!defs || uses->scrap < defs->scrap) { fputs("\\NWlink{nuweb", tex_file); write_scrap_ref(tex_file, uses->scrap, -1, &page); fputs("}{", tex_file); write_scrap_ref(tex_file, uses->scrap, TRUE, &page); fputs("}", tex_file); uses = uses->next; } else { if (defs->scrap == uses->scrap) uses = uses->next; fputs("\\underline{", tex_file); fputs("\\NWlink{nuweb", tex_file); write_single_scrap_ref(tex_file, defs->scrap); fputs("}{", tex_file); write_single_scrap_ref(tex_file, defs->scrap); fputs("}}", tex_file); page = -2; defs = defs->next; } while (uses || defs) { if (uses && (!defs || uses->scrap < defs->scrap)) { fputs("\\NWlink{nuweb", tex_file); write_scrap_ref(tex_file, uses->scrap, -1, &page); fputs("}{", tex_file); write_scrap_ref(tex_file, uses->scrap, FALSE, &page); fputs("}", tex_file); uses = uses->next; } else { if (uses && defs->scrap == uses->scrap) uses = uses->next; fputs(", \\underline{", tex_file); fputs("\\NWlink{nuweb", tex_file); write_single_scrap_ref(tex_file, defs->scrap); fputs("}{", tex_file); write_single_scrap_ref(tex_file, defs->scrap); fputs("}", tex_file); putc('}', tex_file); page = -2; defs = defs->next; } } fputs(".\n", tex_file); } }@} \section{Writing the LaTeX File with HTML Scraps} \label{html-file} The HTML generated is patterned closely upon the {\LaTeX} generated in the previous section.\footnote{\relax While writing this section, I tried to follow Preston's style as displayed in Section~\ref{latex-file}---J. D. R.} When a file name ends in \verb|.hw|, the second pass (invoked via a call to \verb|write_html|) copies most of the text from the source file straight into a \verb|.tex| file. Definitions are formatted slightly and cross-reference information is printed out. @d Function... @{extern void write_html(char *file_name, char *html_name); @} We need a few local function declarations before we get into the body of \verb|write_html|. @o html.c @{static void copy_scrap(FILE *file, int prefix); /* formats the body of a scrap */ static void display_scrap_ref(FILE *html_file, int num); /* formats a scrap reference */ static void display_scrap_numbers(FILE *html_file, Scrap_Node *scraps); /* formats a list of scrap numbers */ static void print_scrap_numbers(FILE *html_file, Scrap_Node *scraps); /* pluralizes scrap formats list */ static void format_entry(Name *name, FILE *html_file, int file_flag); /* formats an index entry */ static void format_user_entry(Name *name, FILE *html_file, int sector); @} The routine \verb|write_html| takes two file names as parameters: the name of the web source file and the name of the \verb|.tex| output file. @o html.c @{void write_html(char *file_name, char *html_name) { FILE *html_file = fopen(html_name, "w"); FILE *tex_file = html_file; @ if (html_file) { if (verbose_flag) fprintf(stderr, "writing %s\n", html_name); source_open(file_name); @ fclose(html_file); } else fprintf(stderr, "%s: can't open %s\n", command_name, html_name); } @| write_html @} We make our second (and final) pass through the source web, this time copying characters straight into the \verb|.tex| file. However, we keep an eye peeled for \verb|@@|~characters, which signal a command sequence. @d Copy \verb|source_file| into \verb|html_file| @{{ int c = source_get(); while (c != EOF) { if (c == nw_char) @ else { putc(c, html_file); c = source_get(); } } }@} @d Interpret HTML at-sequence @{{ c = source_get(); switch (c) { case 'r': c = source_get(); nw_char = c; update_delimit_scrap(); break; case 'O': case 'o': @ break; case 'Q': case 'q': case 'D': case 'd': @ break; case 'f': @ break; case 'm': @ break; case 'u': @ break; default: if (c==nw_char) putc(c, html_file); c = source_get(); break; } }@} \subsection{Formatting Definitions} We go through only a little amount of effort to format a definition. The HTML for the previous fragment definition should look like this (perhaps modulo the scrap references): \begin{verbatim}
<Interpret HTML at-sequence 68> =
{
  c = source_get();
  switch (c) {
    case 'O':
    case 'o': <Write HTML output file definition 69>
              break;
    case 'D':
    case 'd': <Write HTML macro definition 71>
              break;
    case 'f': <Write HTML index of file names 86>
              break;
    case 'm': <Write HTML index of macro names 87>
              break;
    case 'u': <Write HTML index of user-specified names 93>
              break;
    default:
         if (c==nw_char)
           putc(c, html_file);
         c = source_get();
              break;
  }
}<>
Fragment referenced in scrap 67.
\end{verbatim} Fragment and file definitions are formatted nearly identically. I've factored the common parts out into separate scraps. @d Write HTML output file definition @{{ Name *name = collect_file_name(); @ @ scraps++; @ @ @ }@} @d Write HTML output file declaration @{ fputs("\"%s\" ", name->spelling); write_single_scrap_ref(html_file, scraps); fputs(" =\n", html_file); @} @d Write HTML macro definition @{{ Name *name = collect_macro_name(); @ @ scraps++; @ @ @ @ }@} I don't format a fragment name at all specially, figuring the programmer might want to use italics or bold face in the midst of the name. Note that in this implementation, programmers may only use directives in fragment names that are recognized in preformatted text elements (PRE). Modification 2001--02--15.: I'm interpreting the fragment name as regular LaTex, so that any formatting can be used in it. To use HTML formatting, the \verb|rawhtml| environment should be used. @d Write HTML macro declaration @{ fputs("<\\end{rawhtml}", html_file); fputs(name->spelling, html_file); fputs("\\begin{rawhtml} ", html_file); write_single_scrap_ref(html_file, scraps); fputs("> =\n", html_file); @} @d Begin HTML scrap environment @{{ fputs("\\begin{rawhtml}\n", html_file); fputs("
\n", html_file);
}@}

The end of a scrap is marked with the characters \verb|<>|.
@d Fill in the middle of HTML scrap environment
@{{
  copy_scrap(html_file, TRUE);
  fputs("<>
\n", html_file); }@} The only task remaining is to get rid of the current at command and end the paragraph. @d Finish HTML scrap environment @{{ fputs("\\end{rawhtml}\n", html_file); c = source_get(); /* Get rid of current at command. */ }@} \subsubsection{Formatting Cross References} @d Write HTML file defs @{{ if (name->defs->next) { fputs("\\end{rawhtml}\\NWtxtFileDefBy\\begin{rawhtml} ", html_file); print_scrap_numbers(html_file, name->defs); fputs("
\n", html_file); } }@} @d Write HTML macro defs @{{ if (name->defs->next) { fputs("\\end{rawhtml}\\NWtxtMacroDefBy\\begin{rawhtml} ", html_file); print_scrap_numbers(html_file, name->defs); fputs("
\n", html_file); } }@} @d Write HTML macro refs @{{ if (name->uses) { fputs("\\end{rawhtml}\\NWtxtMacroRefIn\\begin{rawhtml} ", html_file); print_scrap_numbers(html_file, name->uses); } else { fputs("\\end{rawhtml}{\\NWtxtMacroNoRef}.\\begin{rawhtml}", html_file); fprintf(stderr, "%s: <%s> never referenced.\n", command_name, name->spelling); } fputs("
\n", html_file); }@} @o html.c @{static void display_scrap_ref(FILE *html_file, int num) { fputs("", html_file); write_single_scrap_ref(html_file, num); fputs("", html_file); } @| display_scrap_ref @} @o html.c @{static void display_scrap_numbers(FILE *html_file, Scrap_Node *scraps) { display_scrap_ref(html_file, scraps->scrap); scraps = scraps->next; while (scraps) { fputs(", ", html_file); display_scrap_ref(html_file, scraps->scrap); scraps = scraps->next; } } @| display_scrap_numbers @} @o html.c @{static void print_scrap_numbers(FILE *html_file, Scrap_Node *scraps) { display_scrap_numbers(html_file, scraps); fputs(".\n", html_file); } @| print_scrap_numbers @} \subsubsection{Formatting a Scrap} We must translate HTML special keywords into entities in scraps. @o html.c @{static void copy_scrap(FILE *file, int prefix) { int indent = 0; int c = source_get(); while (1) { switch (c) { case '<' : fputs("<", file); indent++; break; case '>' : fputs(">", file); indent++; break; case '&' : fputs("&", file); indent++; break; case '\n': fputc(c, file); indent = 0; break; case '\t': @ break; default: if (c==nw_char) { @ break; } putc(c, file); indent++; break; } c = source_get(); } } @| copy_scrap @} @d Check HTML at-sequence... @{{ c = source_get(); switch (c) { case '+': case '-': case '*': case '|': @ case ',': case '}': case ']': case ')': return; case '_': @ break; case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': fputc(nw_char, file); fputc(c, file); break; case '<': @ break; case '%': @ break; default: if (c==nw_char) { fputc(c, file); break; } /* ignore these since pass1 will have warned about them */ break; } }@} There's no need to check for errors here, since we will have already pointed out any during the first pass. @d Format HTML macro name @{{ Arglist * args = collect_scrap_name(-1); Name *name = args->name; fputs("<\\end{rawhtml}", file); fputs(name->spelling, file); if (scrap_name_has_parameters) { @ } fputs("\\begin{rawhtml} ", file); if (name->defs) @ else { putc('?', file); fprintf(stderr, "%s: never defined <%s>\n", command_name, name->spelling); } fputs(">", file); }@} @d Write HTML abbreviated definition list @{{ Scrap_Node *p = name->defs; display_scrap_ref(file, p->scrap); if (p->next) fputs(", ... ", file); }@} \subsection{Generating the Indices} @d Write HTML index of file names @{{ if (file_names) { fputs("\\begin{rawhtml}\n", html_file); fputs("
\n", html_file); format_entry(file_names, html_file, TRUE); fputs("
\n", html_file); fputs("\\end{rawhtml}\n", html_file); } c = source_get(); }@} @d Write HTML index of macro names @{{ if (macro_names) { fputs("\\begin{rawhtml}\n", html_file); fputs("
\n", html_file); format_entry(macro_names, html_file, FALSE); fputs("
\n", html_file); fputs("\\end{rawhtml}\n", html_file); } c = source_get(); }@} @d Write HTML bold tag or end @{{ static int toggle; toggle = ~toggle; if( toggle ) { fputs( "", file ); } else { fputs( "", file ); } }@} @o html.c @{static void format_entry(Name *name, FILE *html_file, int file_flag) { while (name) { format_entry(name->llink, html_file, file_flag); @ name = name->rlink; } } @| format_entry @} @d Format an HTML index entry @{{ fputs("
", html_file); if (file_flag) { fprintf(html_file, "\"%s\"\n
", name->spelling); @ } else { fputs("<\\end{rawhtml}", html_file); fputs(name->spelling, html_file); fputs("\\begin{rawhtml} ", html_file); @ fputs(">\n
", html_file); @ } putc('\n', html_file); }@} @d Write HTML file's defining scrap numbers @{{ fputs("\\end{rawhtml}\\NWtxtDefBy\\begin{rawhtml} ", html_file); print_scrap_numbers(html_file, name->defs); }@} @d Write HTML defining scrap numbers @{{ if (name->defs) display_scrap_numbers(html_file, name->defs); else putc('?', html_file); }@} @d Write HTML referencing scrap numbers @{{ Scrap_Node *p = name->uses; if (p) { fputs("\\end{rawhtml}\\NWtxtRefIn\\begin{rawhtml} ", html_file); print_scrap_numbers(html_file, p); } else fputs("\\end{rawhtml}{\\NWtxtNoRef}.\\begin{rawhtml}", html_file); }@} @d Write HTML index of user-specified names @{{ if (user_names) { fputs("\\begin{rawhtml}\n", html_file); fputs("
\n", html_file); format_user_entry(user_names, html_file, 0/* Dummy */); fputs("
\n", html_file); fputs("\\end{rawhtml}\n", html_file); } c = source_get(); }@} @o html.c @{static void format_user_entry(Name *name, FILE *html_file, int sector) { while (name) { format_user_entry(name->llink, html_file, sector); @ name = name->rlink; } } @| format_user_entry @} @d Format a user HTML index entry @{{ Scrap_Node *uses = name->uses; if (uses) { Scrap_Node *defs = name->defs; fprintf(html_file, "
%s:\n
", name->spelling); if (uses->scrap < defs->scrap) { display_scrap_ref(html_file, uses->scrap); uses = uses->next; } else { if (defs->scrap == uses->scrap) uses = uses->next; fputs("", html_file); display_scrap_ref(html_file, defs->scrap); fputs("", html_file); defs = defs->next; } while (uses || defs) { fputs(", ", html_file); if (uses && (!defs || uses->scrap < defs->scrap)) { display_scrap_ref(html_file, uses->scrap); uses = uses->next; } else { if (uses && defs->scrap == uses->scrap) uses = uses->next; fputs("", html_file); display_scrap_ref(html_file, defs->scrap); fputs("", html_file); defs = defs->next; } } fputs(".\n", html_file); } }@} \section{Writing the Output Files} \label{output-files} @d Function pro... @{extern void write_files(Name *files); @} @o output.c -cc -d @{void write_files(Name *files) { while (files) { write_files(files->llink); @spelling|@> files = files->rlink; } } @| write_files @} \verb|MAX_INDENT| defines the maximum number of leading whitespace characters. This is only a problem when outputting very long lines, possibly by multiple definitions of the same fragment (as in a list of elements which is added to every time a new element is defined). @d Type dec... @{ #define MAX_INDENT 500 @| MAX_INDENT @} @d Write out \verb|files->spelling| @{{ static char temp_name[FILENAME_MAX]; static char real_name[FILENAME_MAX]; static int temp_name_count = 0; char indent_chars[MAX_INDENT]; FILE *temp_file; @< Find a free temporary file @> sprintf(real_name, "%s%s%s", dirpath, path_sep, files->spelling); if (verbose_flag) fprintf(stderr, "writing %s [%s]\n", files->spelling, temp_name); write_scraps(temp_file, files->spelling, files->defs, 0, indent_chars, files->debug_flag, files->tab_flag, files->indent_flag, files->comment_flag, NULL, NULL, 0, files->spelling); fclose(temp_file); @< Move the temporary file to the target, if required @> }@} @d Find a free temporary file @{@% for( temp_name_count = 0; temp_name_count < 10000; temp_name_count++) { sprintf(temp_name,"%s%snw%06d", dirpath, path_sep, temp_name_count); #ifdef O_EXCL if (-1 != (temp_file_fd = open(temp_name, O_CREAT|O_WRONLY|O_EXCL))) { temp_file = fdopen(temp_file_fd, "w"); break; } #else if (0 != (temp_file = fopen(temp_name, "a"))) { if ( 0L == ftell(temp_file)) { break; } else { fclose(temp_file); temp_file = 0; } } #endif } if (!temp_file) { fprintf(stderr, "%s: can't create %s for a temporary file\n", command_name, temp_name); exit(-1); } @} Note the call to \verb|remove| before \verb|rename| -- the ANSI/ISO C standard does {\em not} guarantee that renaming a file to an existing filename will overwrite the file. @d Move the temporary file to the target, if required @{@% if (compare_flag) @ else { remove(real_name); @< Rename the temporary file to the target @> } @} Again, we use a call to \verb|remove| before \verb|rename|. @d Compare the temp file... @{{ FILE *old_file = fopen(real_name, "r"); if (old_file) { int x, y; temp_file = fopen(temp_name, "r"); do { x = getc(old_file); y = getc(temp_file); } while (x == y && x != EOF); fclose(old_file); fclose(temp_file); if (x == y) remove(temp_name); else { remove(real_name); @< Rename the temporary file to the target @> } } else @< Rename the temporary file to the target @> }@} @d Rename the temporary file to the target @{@% if (0 != rename(temp_name, real_name)) { fprintf(stderr, "%s: can't rename output file to %s\n", command_name, real_name); } @|@} \chapter{The Support Routines} \section{Source Files} \label{source-files} \subsection{Global Declarations} We need two routines to handle reading the source files. @d Function pro... @{extern void source_open(char *name); /* pass in the name of the source file */ extern int source_get(void); /* no args; returns the next char or EOF */ extern int source_last; /* what last source_get() returned. */ extern int source_peek; /* The next character to get */ @} There are also two global variables maintained for use in error messages and such. @d Global variable dec... @{extern char *source_name; /* name of the current file */ extern int source_line; /* current line in the source file */ @| source_name source_line @} @d Global variable def... @{char *source_name = NULL; int source_line = 0; @} \subsection{Local Declarations} @o input.c -cc -d @{static FILE *source_file; /* the current input file */ static int double_at; static int include_depth; @| source_file double_at include_depth @} @o input.c -cc -d @{static struct { FILE *file; char *name; int line; } stack[10]; @| stack @} \subsection{Reading a File} The routine \verb|source_get| returns the next character from the current source file. It notices newlines and keeps the line counter \verb|source_line| up to date. It also catches \verb|EOF| and watches for \verb|@@|~characters. All other characters are immediately returned. We define \verb|source_last| to let us tell which type of scrap we are defining. @o input.c -cc -d @{ int source_peek; int source_last; int source_get(void) { int c; source_last = c = source_peek; switch (c) { case EOF: @ return c; case '\n': source_line++; default: if (c==nw_char) { @ return c; } source_peek = getc(source_file); return c; } } @| source_peek source_get source_last @} \verb|source_ungetc| pushes a read character back to the \verb|source_file|. @d Function prototypes @{extern void source_ungetc(int*); @} @o input.c -cc -d @{void source_ungetc(int *c) { ungetc(source_peek, source_file); if(*c == '\n') source_line--; source_peek=*c; } @|source_ungetc @} This whole \verb|@@|~character handling mess is pretty annoying. I want to recognize \verb|@@i| so I can handle include files correctly. At the same time, it makes sense to recognize illegal \verb|@@|~sequences and complain; this avoids ever having to check anywhere else. Unfortunately, I need to avoid tripping over the \verb|@@@@|~sequence; hence this whole unsatisfactory \verb|double_at| business. @d Handle an ``at''... @{{ c = getc(source_file); if (double_at) { source_peek = c; double_at = FALSE; c = nw_char; } else switch (c) { case 'i': @ break; case '#': case 'f': case 'm': case 'u': case 'v': case 'd': case 'o': case 'D': case 'O': case 's': case 'q': case 'Q': case 'S': case 't': case '+': case '-': case '*': case '\'': case '{': case '}': case '<': case '>': case '|': case '(': case ')': case '[': case ']': case '%': case '_': case ':': case ',': case 'x': case 'c': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': case 'r': source_peek = c; c = nw_char; break; default: if (c==nw_char) { source_peek = c; double_at = TRUE; break; } fprintf(stderr, "%s: bad %c sequence %c[%d] (%s, line %d)\n", command_name, nw_char, c, c, source_name, source_line); exit(-1); } }@} @d Open an include file @{{ char name[FILENAME_MAX]; char fullname[FILENAME_MAX]; struct incl * p = include_list; if (include_depth >= 10) { fprintf(stderr, "%s: include nesting too deep (%s, %d)\n", command_name, source_name, source_line); exit(-1); } @ stack[include_depth].file = source_file; fullname[0] = '\0'; for (;;) { strcat(fullname, name); source_file = fopen(fullname, "r"); if (source_file || !p) break; strcpy(fullname, p->name); strcat(fullname, "/"); p = p->next; } if (!source_file) { fprintf(stderr, "%s: can't open include file %s\n", command_name, name); source_file = stack[include_depth].file; } else { stack[include_depth].name = source_name; stack[include_depth].line = source_line + 1; include_depth++; source_line = 1; source_name = save_string(fullname); } source_peek = getc(source_file); c = source_get(); }@} @d Collect include-file name @{{ char *p = name; do c = getc(source_file); while (c == ' ' || c == '\t'); while (isgraph(c)) { *p++ = c; c = getc(source_file); } *p = '\0'; if (c != '\n') { fprintf(stderr, "%s: unexpected characters after file name (%s, %d)\n", command_name, source_name, source_line); exit(-1); } }@} If an \verb|EOF| is discovered, the current file must be closed and input from the next stacked file must be resumed. If no more files are on the stack, the \verb|EOF| is returned. @d Handle \verb|EOF| @{{ fclose(source_file); if (include_depth) { include_depth--; source_file = stack[include_depth].file; source_line = stack[include_depth].line; source_name = stack[include_depth].name; source_peek = getc(source_file); c = source_get(); } }@} \subsection{Opening a File} The routine \verb|source_open| takes a file name and tries to open the file. If unsuccessful, it complains and halts. Otherwise, it sets \verb|source_name|, \verb|source_line|, and \verb|double_at|. @o input.c -cc -d @{void source_open(char *name) { source_file = fopen(name, "r"); if (!source_file) { fprintf(stderr, "%s: couldn't open %s\n", command_name, name); exit(-1); } nw_char = '@@'; source_name = name; source_line = 1; source_peek = getc(source_file); double_at = FALSE; include_depth = 0; } @| source_open @} \section{Scraps} \label{scraps} @o scraps.c -cc -d @{#define SLAB_SIZE 1024 typedef struct slab { struct slab *next; char chars[SLAB_SIZE]; } Slab; @| Slab next SLAB_SIZE @} @o scraps.c -cc -d @{typedef struct { char *file_name; Slab *slab; struct uses *uses; struct uses *defs; int file_line; int page; char letter; unsigned char sector; } ScrapEntry; @| file_name slab uses defs file_line page letter sector ScrapEntry @} There's some issue with reading the auxiliary file if it has more than 255 lines, so (as a workround only) increase the size of the array of \verb|ScrapEntry|s held in \verb|SCRAP[n]|. @o scraps.c -cc -d @{ #define SCRAP_BITS 10 #define SCRAP_SIZE (1<> SCRAP_SHIFT][(i) & SCRAP_MASK] static int scraps; int num_scraps(void) { return scraps; }; @ @| num_scraps scraps scrap_array SCRAP_BITS SCRAP_SIZE SCRAP_MASK SCRAP_SHIFT SCRAP @} @d Function pro... @{extern void init_scraps(void); extern int collect_scrap(void); extern int write_scraps(FILE *file, char *spelling, Scrap_Node *defs, int global_indent, char *indent_chars, char debug_flag, char tab_flag, char indent_flag, unsigned char comment_flag, Arglist *inArgs, char *inParams[9], Parameters parameters, char *title); extern void write_scrap_ref(FILE *file, int num, int first, int *page); extern void write_single_scrap_ref(FILE *file, int num); extern int num_scraps(void); @} @o scraps.c -cc -d @{void init_scraps(void) { scraps = 1; SCRAP[0] = (ScrapEntry *) arena_getmem(SCRAP_SIZE * sizeof(ScrapEntry)); } @| init_scraps @} @o scraps.c -cc -d @{void write_scrap_ref(FILE *file, int num, int first, int *page) { if (scrap_array(num).page >= 0) { if (first!=0) fprintf(file, "%d", scrap_array(num).page); else if (scrap_array(num).page != *page) fprintf(file, ", %d", scrap_array(num).page); if (scrap_array(num).letter > 0) fputc(scrap_array(num).letter, file); } else { if (first!=0) putc('?', file); else fputs(", ?", file); @ } if (first>=0) *page = scrap_array(num).page; } @| write_scrap_ref @} @o scraps.c -cc -d @{void write_single_scrap_ref(FILE *file, int num) { int page; write_scrap_ref(file, num, TRUE, &page); } @| write_single_scrap_ref @} @d Warn (only once) about needing to... @{{ if (!already_warned) { fprintf(stderr, "%s: you'll need to rerun nuweb after running latex\n", command_name); already_warned = TRUE; } }@} @d Global variable dec... @{extern int already_warned; @| already_warned @} @d Global variable def... @{int already_warned = 0; @} @o scraps.c -cc -d @{typedef struct { Slab *scrap; Slab *prev; int index; } Manager; @| Manager @} @o scraps.c -cc -d @{static void push(char c, Manager *manager) { Slab *scrap = manager->scrap; int index = manager->index; scrap->chars[index++] = c; if (index == SLAB_SIZE) { Slab *new = (Slab *) arena_getmem(sizeof(Slab)); scrap->next = new; manager->scrap = new; index = 0; } manager->index = index; } @| push @} @o scraps.c -cc -d @{static void pushs(char *s, Manager *manager) { while (*s) push(*s++, manager); } @| pushs @} @o scraps.c -cc -d @{int collect_scrap(void) { int current_scrap, lblseq = 0; int depth = 1; Manager writer; @ @ } @| collect_scrap @} @d Create new scrap, managed by \verb|writer| @{{ Slab *scrap = (Slab *) arena_getmem(sizeof(Slab)); if ((scraps & SCRAP_MASK) == 0) SCRAP[scraps >> SCRAP_SHIFT] = (ScrapEntry *) arena_getmem(SCRAP_SIZE * sizeof(ScrapEntry)); scrap_array(scraps).slab = scrap; scrap_array(scraps).file_name = save_string(source_name); scrap_array(scraps).file_line = source_line; scrap_array(scraps).page = -1; scrap_array(scraps).letter = 0; scrap_array(scraps).uses = NULL; scrap_array(scraps).defs = NULL; scrap_array(scraps).sector = current_sector; writer.scrap = scrap; writer.index = 0; current_scrap = scraps++; }@} @d Accumulate scrap... @{{ int c = source_get(); while (1) { switch (c) { case EOF: fprintf(stderr, "%s: unexpect EOF in (%s, %d)\n", command_name, scrap_array(current_scrap).file_name, scrap_array(current_scrap).file_line); exit(-1); default: if (c==nw_char) { @ break; } push(c, &writer); c = source_get(); break; } } }@} @d Handle at-sign during scrap accumulation @{{ c = source_get(); switch (c) { case '(': case '[': case '{': depth++; break; case '+': case '-': case '*': case '|': @ /* Fall through */ case ')': case ']': case '}': if (--depth > 0) break; /* else fall through */ case ',': push('\0', &writer); scrap_ended_with = c; return current_scrap; case '<': @ break; case '%': @ /* emit line break to the output file to keep #line in sync. */ push('\n', &writer); c = source_get(); break; case 'x': @ break; case 'c': @ break; case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': case 'f': case '#': case 'v': case 't': case 's': push(nw_char, &writer); break; case '_': c = source_get(); break; default : if (c==nw_char) { push(nw_char, &writer); push(nw_char, &writer); c = source_get(); break; } fprintf(stderr, "%s: unexpected %c%c in scrap (%s, %d)\n", command_name, nw_char, c, source_name, source_line); exit(-1); } }@} @d Get label while collecting scrap @{{ @ @ push(nw_char, &writer); push('x', &writer); pushs(label_name, &writer); push(nw_char, &writer); }@} @d Collect user-specified index entries @{{ do { int type = c; do { char new_name[MAX_NAME_LEN]; char *p = new_name; unsigned int sector = 0; do c = source_get(); while (isspace(c)); if (c != nw_char) { Name *name; do { *p++ = c; c = source_get(); } while (c != nw_char && !isspace(c)); *p = '\0'; switch (type) { case '*': sector = current_sector; @ break; case '-': @ /* Fall through */ case '|': sector = current_sector; /* Fall through */ case '+': @ break; } } } while (c != nw_char); c = source_get(); }while (c == '|' || c == '*' || c == '-' || c == '+'); if (c != '}' && c != ']' && c != ')') { fprintf(stderr, "%s: unexpected %c%c in index entry (%s, %d)\n", command_name, nw_char, c, source_name, source_line); exit(-1); } }@} @d Add user identifier use @{name = name_add(&user_names, new_name, sector); if (!name->uses || name->uses->scrap != current_scrap) { Scrap_Node *use = (Scrap_Node *) arena_getmem(sizeof(Scrap_Node)); use->scrap = current_scrap; use->next = name->uses; name->uses = use; add_uses(&(scrap_array(current_scrap).uses), name); }@} @d Add user identifier definition @{name = name_add(&user_names, new_name, sector); if (!name->defs || name->defs->scrap != current_scrap) { Scrap_Node *def = (Scrap_Node *) arena_getmem(sizeof(Scrap_Node)); def->scrap = current_scrap; def->next = name->defs; name->defs = def; add_uses(&(scrap_array(current_scrap).defs), name); }@} @d Handle macro invocation in scrap @{{ Arglist * args = collect_scrap_name(current_scrap); Name *name = args->name; @ add_to_use(name, current_scrap); if (scrap_name_has_parameters) { @ } push(nw_char, &writer); push('>', &writer); c = source_get(); }@} @d Save macro name @{{ char buff[24]; push(nw_char, &writer); push('<', &writer); push(name->sector, &writer); sprintf(buff, "%p", args); pushs(buff, &writer); }@} @o scraps.c -cc -d @{void add_to_use(Name * name, int current_scrap) { if (!name->uses || name->uses->scrap != current_scrap) { Scrap_Node *use = (Scrap_Node *) arena_getmem(sizeof(Scrap_Node)); use->scrap = current_scrap; use->next = name->uses; name->uses = use; } } @| add_to_use @} @d Function... @{extern void add_to_use(Name * name, int current_scrap); @} @o scraps.c -cc -d @{static char pop(Manager *manager) { Slab *scrap = manager->scrap; int index = manager->index; char c = scrap->chars[index++]; if (index == SLAB_SIZE) { manager->prev = scrap; manager->scrap = scrap->next; index = 0; } manager->index = index; return c; } @| pop @} @o scraps.c -cc -d @{static void backup(int n, Manager *manager) { int index = manager->index; if (n > index && manager->prev != NULL) { manager->scrap = manager->prev; manager->prev = NULL; index += SLAB_SIZE; } manager->index = (n <= index ? index - n : 0); } @| backup @} @o scraps.c -cc -d @{void lookup(int n, Arglist * par, char * arg[9], Name **name, Arglist ** args) { int i; Arglist * p = par; for (i = 0; i < n && p != NULL; i++) p = p->next; if (p == NULL) { char * a = arg[n]; *name = NULL; *args = (Arglist *)a; } else { *name = p->name; *args = p->args; } } @| lookup @} @o scraps.c -cc -d @{Arglist * instance(Arglist * a, Arglist * par, char * arg[9], int * ch) { if (a != NULL) { int changed = 0; Arglist *args, *next; Name* name; @ if (changed){ @ *ch = 1; } } return a; } @| instance @} @d Function prototypes @{extern Arglist *instance(Arglist *a, Arglist *par, char *arg[9], int *ch); @} @d Set up name, args and next @{next = instance(a->next, par, arg, &changed); name = a->name; if (name == (Name *)1) { Embed_Node * q = (Embed_Node *)arena_getmem(sizeof(Embed_Node)); q->defs = (Scrap_Node *)a->args; q->args = par; args = (Arglist *)q; changed = 1; } else if (name != NULL) args = instance(a->args, par, arg, &changed); else { char * p = (char *)a->args; if (p[0] == ARG_CHR) { lookup(p[1] - '1', par, arg, &name, &args); changed = 1; } else { args = a->args; } }@} @d Build a new arglist @{a = (Arglist *)arena_getmem(sizeof(Arglist)); a->name = name; a->args = args; a->next = next;@} @o scraps.c -cc -d @{static Arglist *pop_scrap_name(Manager *manager, Parameters *parameters) { char name[MAX_NAME_LEN]; char *p = name; Arglist * args; int c; (void)pop(manager); /* not sure why we have to pop twice */ c = pop(manager); while (c != nw_char) { *p++ = c; c = pop(manager); } *p = '\000'; if (sscanf(name, "%p", &args) != 1) { fprintf(stderr, "%s: found an internal problem (2)\n", command_name); exit(-1); } @ return args; } @| pop_scrap_name @} @d Check for end of scrap name @{{ c = pop(manager); @ }@} @o scraps.c -cc -d @{int write_scraps(FILE *file, char *spelling, Scrap_Node *defs, int global_indent, char *indent_chars, char debug_flag, char tab_flag, char indent_flag, unsigned char comment_flag, Arglist *inArgs, char *inParams[9], Parameters parameters, char *title) { /* This is in file @f */ int indent = 0; int newline = 1; while (defs) { @scrap| to \verb|file|@> defs = defs->next; } return indent + global_indent; } @| write_scraps @} @d Forward declarations for scraps.c @{int delayed_indent = 0; @| delayed_indent @} @d Copy \verb|defs->scrap... @{{ char c; Manager reader; Parameters local_parameters = 0; int line_number = scrap_array(defs->scrap).file_line; reader.scrap = scrap_array(defs->scrap).slab; reader.index = 0; @ if (delayed_indent) { @ } c = pop(&reader); while (c) { switch (c) { case '\n': if (global_indent >= 0) { putc(c, file); line_number++; newline = 1; delayed_indent = 0; @ break; } else { /* Don't show newlines in embedded fragmants */ fputs(". . .", file); return 0; } case '\t': @ delayed_indent = 0; break; default: if (c==nw_char) { @ break; } putc(c, file); if (global_indent >= 0) { @ } indent++; if (c > ' ') newline = 0; delayed_indent = 0; break; } c = pop(&reader); } }@} We need to make sure that we don't overflow \verb|indent_chars[]|. @d Add more indentation @'char@' @{{ if (global_indent + indent >= MAX_INDENT) { fprintf(stderr, "Error! maximum indentation exceeded in \"%s\".\n", spelling); exit(1); } indent_chars[global_indent + indent] = @1; }@} @d Insert debugging information if required @{if (debug_flag) { fprintf(file, "\n#line %d \"%s\"\n", line_number, scrap_array(defs->scrap).file_name); @ }@} @d Insert approp... @{{ char c1 = pop(&reader); char c2 = pop(&reader); if (indent_flag && !(@)) { @ } indent = 0; backup(2, &reader); }@} @d Put out the indent @{if (tab_flag) for (indent=0; indent else { putc('\t', file); if (global_indent >= 0) { @ } indent++; } }@} @d Check for macro invocation... @{{ int oldin = indent; char oldcf = comment_flag; c = pop(&reader); switch (c) { case 't': @ break; case 'c': @ break; case 'f': @ break; case 'x': @ case '_': break; case 'v': @ break; case 's': indent = -global_indent; comment_flag = 0; break; case '<': @ @ indent = oldin; comment_flag = oldcf; break; @ indent = oldin; break; default: if(c==nw_char) { putc(c, file); if (global_indent >= 0) { @ } indent++; break; } /* ignore, since we should already have a warning */ break; } }@} @d Copy label from scrap into file @{{ @ write_label(label_name, file); }@} @d Copy file name into file @{if (defs->quoted) fprintf(file, "%cf", nw_char); else fputs(spelling, file); @} @d Copy macro into... @{{ Arglist *a = pop_scrap_name(&reader, &local_parameters); Name *name = a->name; int changed; Arglist * args = instance(a->args, inArgs, inParams, &changed); int i, narg; char * p = name->spelling; char * * inParams = name->arg; Arglist *q = args; if (name->mark) { fprintf(stderr, "%s: recursive macro discovered involving <%s>\n", command_name, name->spelling); exit(-1); } if (name->defs && !defs->quoted) { @ name->mark = TRUE; indent = write_scraps(file, spelling, name->defs, global_indent + indent, indent_chars, debug_flag, tab_flag, indent_flag, comment_flag, args, name->arg, local_parameters, name->spelling); indent -= global_indent; name->mark = FALSE; } else { if (delayed_indent) { for (i = indent + global_indent; --i >= 0; ) putc(' ', file); } fprintf(file, "%c<", nw_char); if (name->sector == 0) fputc('+', file); @ fprintf(file, "%c>", nw_char); if (!defs->quoted && !tex_flag) fprintf(stderr, "%s: macro never defined <%s>\n", command_name, name->spelling); } }@} @d Perhaps comment this macro @{if (comment_flag && newline) { @ fputs(comment_begin[comment_flag], file); @ if (xref_flag) { putc(' ', file); write_single_scrap_ref(file, name->defs->scrap); } fputs(comment_end[comment_flag], file); putc('\n', file); if (!delayed_indent) for (i = indent + global_indent; --i >= 0; ) putc(' ', file); } @} @d Perhaps put a delayed indent @{if (delayed_indent) for (i = indent + global_indent; --i >= 0; ) putc(' ', file); @} @d Copy fragment title into file @{{ char * p = title; Arglist *q = inArgs; int narg; @ if (xref_flag) { putc(' ', file); write_single_scrap_ref(file, defs->scrap); } }@} @d Comment this macro use @{narg = 0; while (*p != '\000') { if (*p == ARG_CHR) { if (q == NULL) { if (defs->quoted) fprintf(file, "%c'%s%c'", nw_char, inParams[narg], nw_char); else fprintf(file, "'%s'", inParams[narg]); } else { comment_ArglistElement(file, q, defs->quoted); q = q->next; } p++; narg++; } else fputc(*p++, file); }@} @d Forward declarations for scraps.c @{static void comment_ArglistElement(FILE * file, Arglist * args, int quote) { Name *name = args->name; Arglist *q = args->args; if (name == NULL) { if (quote) fprintf(file, "%c'%s%c'", nw_char, (char *)q, nw_char); else fprintf(file, "'%s'", (char *)q); } else if (name == (Name *)1) { @ } else { @ } } @| comment_ArglistElement @} @d Include an embedded scrap in comment @{Embed_Node * e = (Embed_Node *)q; fputc('{', file); write_scraps(file, "", e->defs, -1, "", 0, 0, 0, 0, e->args, 0, (Parameters)1, ""); fputc('}', file);@} @d Include a fragment use in comment @{char * p = name->spelling; if (quote) fputc(nw_char, file); fputc('<', file); if (quote && name->sector == 0) fputc('+', file); while (*p != '\000') { if (*p == ARG_CHR) { comment_ArglistElement(file, q, quote); q = q->next; p++; } else fputc(*p++, file); } if (quote) fputc(nw_char, file); fputc('>', file);@} \subsection{Collecting Page Numbers} @d Function... @{extern void collect_numbers(char *aux_name); @} @o scraps.c -cc -d @{void collect_numbers(char *aux_name) { if (number_flag) { int i; for (i=1; i } fclose(aux_file); @ } } } @| collect_numbers @} The \textit{memoir} document class creates references in the \verb|.aux| file with nested braces, so after each open brace we have to wait for the matching closing brace. @d Read line in \verb|.aux| file @{@% int scrap_number; int page_number; int i; int dummy_idx; int bracket_depth = 1; if (1 == sscanf(aux_line, "\\newlabel{scrap%d}{{%n", &scrap_number, &dummy_idx)) { for (i = dummy_idx; i < strlen(aux_line) && bracket_depth > 0; i++) { if (aux_line[i] == '{') bracket_depth++; else if (aux_line[i] == '}') bracket_depth--; } if (i > dummy_idx && i < strlen(aux_line) && 1 == sscanf(aux_line+i, "{%d}" ,&page_number)) { if (scrap_number < scraps) scrap_array(scrap_number).page = page_number; else @ } } @|@} @d Add letters to scraps with... @{{ int i = 0; @ @ { int j = i; @ @ i = j; } } @} @d Step @'i@' to the next valid scrap @{do @1++; while (@1 < scraps && scrap_array(@1).page == -1); @} @d For all remaining scraps @{while (i < scraps)@} @d Perhaps add letters to the page numbers @{if (scrap_array(i).page == scrap_array(j).page) { if (scrap_array(i).letter == 0) scrap_array(i).letter = 'a'; scrap_array(j).letter = scrap_array(i).letter + 1; } @} \section{Names} \label{names} @d Type de... @{typedef struct scrap_node { struct scrap_node *next; int scrap; char quoted; } Scrap_Node; @| Scrap_Node @} @d Type de... @{typedef struct name { char *spelling; struct name *llink; struct name *rlink; Scrap_Node *defs; Scrap_Node *uses; char * arg[9]; int mark; char tab_flag; char indent_flag; char debug_flag; unsigned char comment_flag; unsigned char sector; } Name; @| Name @} @d Global variable dec... @{extern Name *file_names; extern Name *macro_names; extern Name *user_names; extern int scrap_name_has_parameters; extern int scrap_ended_with; @| file_names macro_names user_names @} @d Global variable def... @{Name *file_names = NULL; Name *macro_names = NULL; Name *user_names = NULL; int scrap_name_has_parameters; int scrap_ended_with; @} @d Function pro... @{extern Name *collect_file_name(void); extern Name *collect_macro_name(void); extern Arglist *collect_scrap_name(int current_scrap); extern Name *name_add(Name **rt, char *spelling, unsigned char sector); extern Name *prefix_add(Name **rt, char *spelling, unsigned char sector); extern char *save_string(char *); extern void reverse_lists(Name *names); @} @o names.c -cc -d @{enum { LESS, GREATER, EQUAL, PREFIX, EXTENSION }; static int compare(char *x, char *y) { int len, result; int xl = strlen(x); int yl = strlen(y); int xp = x[xl - 1] == ' '; int yp = y[yl - 1] == ' '; if (xp) xl--; if (yp) yl--; len = xl < yl ? xl : yl; result = strncmp(x, y, len); if (result < 0) return GREATER; else if (result > 0) return LESS; else if (xl < yl) { if (xp) return EXTENSION; else return LESS; } else if (xl > yl) { if (yp) return PREFIX; else return GREATER; } else return EQUAL; } @| compare LESS GREATER EQUAL PREFIX EXTENSION @} @o names.c -cc -d @{char *save_string(char *s) { char *new = (char *) arena_getmem((strlen(s) + 1) * sizeof(char)); strcpy(new, s); return new; } @| save_string @} @o names.c -cc -d @{static int ambiguous_prefix(Name *node, char *spelling, unsigned char sector); static char * found_name = NULL; Name *prefix_add(Name **rt, char *spelling, unsigned char sector) { Name *node = *rt; int cmp; while (node) { switch ((cmp = compare(node->spelling, spelling))) { case GREATER: rt = &node->rlink; break; case LESS: rt = &node->llink; break; case EQUAL: found_name = node->spelling; case EXTENSION: if (node->sector > sector) { rt = &node->rlink; break; } else if (node->sector < sector) { rt = &node->llink; break; } if (cmp == EXTENSION) node->spelling = save_string(spelling); return node; case PREFIX: @ return node; } node = *rt; } @ } @| prefix_add @} Since a very short prefix might match more than one fragment name, I need to check for other matches to avoid mistakes. Basically, I simply continue the search down {\em both\/} branches of the tree. @d Check for ambiguous prefix @{{ if (ambiguous_prefix(node->llink, spelling, sector) || ambiguous_prefix(node->rlink, spelling, sector)) fprintf(stderr, "%s: ambiguous prefix %c<%s...%c> (%s, line %d)\n", command_name, nw_char, spelling, nw_char, source_name, source_line); }@} @o names.c -cc -d @{static int ambiguous_prefix(Name *node, char *spelling, unsigned char sector) { while (node) { switch (compare(node->spelling, spelling)) { case GREATER: node = node->rlink; break; case LESS: node = node->llink; break; case EXTENSION: case PREFIX: case EQUAL: if (node->sector > sector) { node = node->rlink; break; } else if (node->sector < sector) { node = node->llink; break; } return TRUE; } } return FALSE; } @} Rob Shillingsburg suggested that I organize the index of user-specified identifiers more traditionally; that is, not relying on strict {\small ASCII} comparisons via \verb|strcmp|. Ideally, we'd like to see the index ordered like this: \begin{quote} \begin{flushleft} aardvark \\ Adam \\ atom \\ Atomic \\ atoms \end{flushleft} \end{quote} The function \verb|robs_strcmp| implements the desired predicate. It returns -2 for alphabetically less-than, -1 for less-than but only differing in case, zero for equal, 1 for greater-than but only in case and 2 for alphabetically greater-than. @d Function prototypes @{extern int robs_strcmp(char*, char*); @} @o names.c -cc -d @{int robs_strcmp(char* x, char* y) { int cmp = 0; for (; *x && *y; x++, y++) { @ @ if (*x == *y) continue; if (islower(*x) && toupper(*x) == *y) { if (!cmp) cmp = 1; continue; } if (islower(*y) && *x == toupper(*y)) { if (!cmp) cmp = -1; continue; } return 2*(toupper(*x) - toupper(*y)); } if (*x) return 2; if (*y) return -2; return cmp; } @| robs_strcmp @} Certain character sequences are invisible when printed. We don't want them to be considered for the alphabetical ordering. @d Skip invisibles on @'p@' @{if (*@1 == '|') @1++; @} @o names.c -cc -d @{Name *name_add(Name **rt, char *spelling, unsigned char sector) { Name *node = *rt; while (node) { int result = robs_strcmp(node->spelling, spelling); if (result > 0) rt = &node->llink; else if (result < 0) rt = &node->rlink; else { found_name = node->spelling; if (node->sector > sector) rt = &node->llink; else if (node->sector < sector) rt = &node->rlink; else return node; } node = *rt; } @ } @| name_add @} @d Create new name... @{{ node = (Name *) arena_getmem(sizeof(Name)); if (found_name && robs_strcmp(found_name, spelling) == 0) node->spelling = found_name; else node->spelling = save_string(spelling); node->mark = FALSE; node->llink = NULL; node->rlink = NULL; node->uses = NULL; node->defs = NULL; node->arg[0] = node->arg[1] = node->arg[2] = node->arg[3] = node->arg[4] = node->arg[5] = node->arg[6] = node->arg[7] = node->arg[8] = NULL; node->tab_flag = TRUE; node->indent_flag = TRUE; node->debug_flag = FALSE; node->comment_flag = 0; node->sector = sector; *rt = node; return node; }@} Name terminated by whitespace. Also check for ``per-file'' flags. Keep skipping white space until we reach scrap. @o names.c -cc -d @{Name *collect_file_name(void) { Name *new_name; char name[MAX_NAME_LEN]; char *p = name; int start_line = source_line; int c = source_get(), c2; while (isspace(c)) c = source_get(); while (isgraph(c)) { *p++ = c; c = source_get(); } if (p == name) { fprintf(stderr, "%s: expected file name (%s, %d)\n", command_name, source_name, start_line); exit(-1); } *p = '\0'; /* File names are always global. */ new_name = name_add(&file_names, name, 0); @ c2 = source_get(); if (c != nw_char || (c2 != '{' && c2 != '(' && c2 != '[')) { fprintf(stderr, "%s: expected %c{, %c[, or %c( after file name (%s, %d)\n", command_name, nw_char, nw_char, nw_char, source_name, start_line); exit(-1); } return new_name; } @| collect_file_name @} @d Handle optional per-file flags @{{ while (1) { while (isspace(c)) c = source_get(); if (c == '-') { c = source_get(); do { switch (c) { case 't': new_name->tab_flag = FALSE; break; case 'd': new_name->debug_flag = TRUE; break; case 'i': new_name->indent_flag = FALSE; break; case 'c': @ break; default : fprintf(stderr, "%s: unexpected per-file flag (%s, %d)\n", command_name, source_name, source_line); break; } c = source_get(); } while (!isspace(c)); } else break; } }@} So far we only deal with C comments. @d Get comment delimiters @{c = source_get(); if (c == 'c') new_name->comment_flag = 1; else if (c == '+') new_name->comment_flag = 2; else if (c == 'p') new_name->comment_flag = 3; else fprintf(stderr, "%s: Unrecognised comment flag (%s, %d)\n", command_name, source_name, source_line); @} @d Forward declarations for scraps.c @{char * comment_begin[4] = { "", "/* ", "// ", "# "}; char * comment_mid[4] = { "", " * ", "// ", "# "}; char * comment_end[4] = { "", " */", "", ""}; @| comment_begin comment_end comment_mid @} Name terminated by \verb+\n+ or \verb+@@{+; but keep skipping until \verb+@@{+ @o names.c -cc -d @{Name *collect_macro_name(void) { char name[MAX_NAME_LEN]; char args[1000]; char * arg[9]; char * argp = args; int argc = 0; char *p = name; int start_line = source_line; int c = source_get(), c2; unsigned char sector = current_sector; if (c == '+') { sector = 0; c = source_get(); } while (isspace(c)) c = source_get(); while (c != EOF) { Name * node; switch (c) { case '\t': case ' ': *p++ = ' '; do c = source_get(); while (c == ' ' || c == '\t'); break; case '\n': @ default: if (c==nw_char) { @ break; } *p++ = c; c = source_get(); break; } } fprintf(stderr, "%s: expected fragment name (%s, %d)\n", command_name, source_name, start_line); exit(-1); return NULL; /* unreachable return to avoid warnings on some compilers */ } @| collect_macro_name @} @d Check for termina... @{{ c = source_get(); switch (c) { case '(': case '[': case '{': @ return install_args(node, argc, arg); case '\'': @ break; default: if (c==nw_char) { *p++ = c; break; } fprintf(stderr, "%s: unexpected %c%c in fragment definition name (%s, %d)\n", command_name, nw_char, c, source_name, start_line); exit(-1); } }@} @d Enter the next argument @{arg[argc] = argp; while ((c = source_get()) != EOF) { if (c==nw_char) { c2 = source_get(); if (c2=='\'') { @ c = source_get(); break; } else *argp++ = c2; } else *argp++ = c; } *p++ = ARG_CHR; @} @d Type dec... @{#define ARG_CHR '\001' @| ARG_CHR@} @d Make this argument @{if (argc < 9) { *argp++ = '\000'; argc += 1; } @} @d Cleanup and install name @{{ if (p > name && p[-1] == ' ') p--; if (p - name > 3 && p[-1] == '.' && p[-2] == '.' && p[-3] == '.') { p[-3] = ' '; p -= 2; } if (p == name || name[0] == ' ') { fprintf(stderr, "%s: empty name (%s, %d)\n", command_name, source_name, source_line); exit(-1); } *p = '\0'; node = prefix_add(¯o_names, name, sector); }@} @d Skip until scrap... @{{ do c = source_get(); while (isspace(c)); c2 = source_get(); if (c != nw_char || (c2 != '{' && c2 != '(' && c2 != '[')) { fprintf(stderr, "%s: expected %c{ after fragment name (%s, %d)\n", command_name, nw_char, source_name, start_line); exit(-1); } @ return install_args(node, argc, arg); }@} @d Function prototypes @{extern Name *install_args(Name * name, int argc, char *arg[0]); @} @o names.c -cc -d @{Name *install_args(Name * name, int argc, char *arg[9]) { int i; for (i = 0; i < argc; i++) { if (name->arg[i] == NULL) name->arg[i] = save_string(arg[i]); } return name; } @} @d Type declarations @{typedef struct arglist {Name * name; struct arglist * args; struct arglist * next; } Arglist; @| Arglist @} @o names.c -cc -d @{Arglist * buildArglist(Name * name, Arglist * a) { Arglist * args = (Arglist *)arena_getmem(sizeof(Arglist)); args->args = a; args->next = NULL; args->name = name; return args; } @| buildArglist @} Terminated by \verb+@@>+ @o names.c -cc -d @{Arglist * collect_scrap_name(int current_scrap) { char name[MAX_NAME_LEN]; char *p = name; int c = source_get(); unsigned char sector = current_sector; Arglist * head = NULL; Arglist ** tail = &head; if (c == '+') { sector = 0; c = source_get(); } while (c == ' ' || c == '\t') c = source_get(); while (c != EOF) { switch (c) { case '\t': case ' ': *p++ = ' '; do c = source_get(); while (c == ' ' || c == '\t'); break; default: if (c==nw_char) { @ break; } if (!isgraph(c)) { fprintf(stderr, "%s: unexpected character in fragment name (%s, %d)\n", command_name, source_name, source_line); exit(-1); } *p++ = c; c = source_get(); break; } } fprintf(stderr, "%s: unexpected end of file (%s, %d)\n", command_name, source_name, source_line); exit(-1); return NULL; /* unreachable return to avoid warnings on some compilers */ } @| collect_scrap_name @} @d Look for end of scrap name... @{{ Name * node; c = source_get(); switch (c) { case '\'': { @< Add plain string argument@> } *p++ = ARG_CHR; c = source_get(); break; case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': { @ } *p++ = ARG_CHR; c = source_get(); break; case '{': { @ } *p++ = ARG_CHR; c = source_get(); break; case '<': @ *p++ = ARG_CHR; c = source_get(); break; case '(': scrap_name_has_parameters = 1; @ return buildArglist(node, head); case '>': scrap_name_has_parameters = 0; @ return buildArglist(node, head); default: if (c==nw_char) { *p++ = c; c = source_get(); break; } fprintf(stderr, "%s: unexpected %c%c in fragment invocation name (%s, %d)\n", command_name, nw_char, c, source_name, source_line); exit(-1); } }@} A plain string argument has no name and a string as a value. @d Add plain string argument @{char buff[MAX_NAME_LEN]; char * s = buff; int c, c2; while ((c = source_get()) != EOF) { if (c==nw_char) { c2 = source_get(); if (c2=='\'') break; *s++ = c2; } else *s++ = c; } *s = '\000'; @@} A parameter argument propagated into an interior fragment has no name and and a string of \verb|ARG_CHAR| followed by a digit as value. @d Add a propagated argument @{char buff[3]; buff[0] = ARG_CHR; buff[1] = c; buff[2] = '\000'; @@} @d Add an inline scrap argument @{int s = collect_scrap(); Scrap_Node * d = (Scrap_Node *)arena_getmem(sizeof(Scrap_Node)); d->scrap = s; d->quoted = 0; d->next = NULL; *tail = buildArglist((Name *)1, (Arglist *)d); tail = &(*tail)->next;@} @d Type declarations @{typedef struct embed { Scrap_Node * defs; Arglist * args; } Embed_Node; @| Embed_Node@} @d Add macro call argument @{*tail = collect_scrap_name(current_scrap); if (current_scrap >= 0) add_to_use((*tail)->name, current_scrap); tail = &(*tail)->next; @} @d Add buff... @{*tail = buildArglist(NULL, (Arglist *)save_string(buff)); tail = &(*tail)->next; @} @o names.c -cc -d @{static Scrap_Node *reverse(Scrap_Node *a); /* a forward declaration */ void reverse_lists(Name *names) { while (names) { reverse_lists(names->llink); names->defs = reverse(names->defs); names->uses = reverse(names->uses); names = names->rlink; } } @| reverse_lists @} Just for fun, here's a non-recursive version of the traditional list reversal code. Note that it reverses the list in place; that is, it does no new allocations. @o names.c -cc -d @{static Scrap_Node *reverse(Scrap_Node *a) { if (a) { Scrap_Node *b = a->next; a->next = NULL; while (b) { Scrap_Node *c = b->next; b->next = a; a = b; b = c; } } return a; } @| reverse @} \section{Searching for Index Entries} \label{search} Given the array of scraps and a set of index entries, we need to search all the scraps for occurrences of each entry. The obvious approach to this problem would be quite expensive for large documents; however, there is an interesting paper describing an efficient solution~\cite{aho:75}. @o scraps.c -cc -d @{typedef struct name_node { struct name_node *next; Name *name; } Name_Node; @| Name_Node @} @o scraps.c -cc -d @{typedef struct goto_node { Name_Node *output; /* list of words ending in this state */ struct move_node *moves; /* list of possible moves */ struct goto_node *fail; /* and where to go when no move fits */ struct goto_node *next; /* next goto node with same depth */ } Goto_Node; @| Goto_Node @} @o scraps.c -cc -d @{typedef struct move_node { struct move_node *next; Goto_Node *state; char c; } Move_Node; @| Move_Node @} @o scraps.c -cc -d @{static Goto_Node *root[256]; static int max_depth; static Goto_Node **depths; @| root max_depth depths @} @o scraps.c -cc -d @{static Goto_Node *goto_lookup(char c, Goto_Node *g) { Move_Node *m = g->moves; while (m && m->c != c) m = m->next; if (m) return m->state; else return NULL; } @| goto_lookup @} \subsection{Retrieving scrap uses} @o scraps.c -cc -d @{typedef struct ArgMgr_s { char * pv; char * bgn; Arglist * arg; struct ArgMgr_s * old; } ArgMgr; @| ArgMgr @} @o scraps.c -cc -d @{typedef struct ArgManager_s { Manager * m; ArgMgr * a; } ArgManager; @| ArgManager @} @o scraps.c -cc -d @{static void pushArglist(ArgManager * mgr, Arglist * a) { ArgMgr * b = malloc(sizeof(ArgMgr)); if (b == NULL) { fprintf(stderr, "Can't allocate space for an argument manager\n"); exit(EXIT_FAILURE); } b->pv = b->bgn = NULL; b->arg = a; b->old = mgr->a; mgr->a = b; } @| pushArglist @} @o scraps.c -cc -d @{static char argpop(ArgManager * mgr) { while (mgr->a != NULL) { ArgMgr * a = mgr->a; @ @ @ } return (pop(mgr->m)); } @| argpop @} We separate individual arguments using spaces. @d Perhaps |return| a character from the current arg @{if (a->pv != NULL) { char c = *a->pv++; if (c != '\0') return c; a->pv = NULL; return ' '; } @} I'm pretty sure that only the first case is ever used. I don't think the others can occur in this context, which makes the whole |ArgManager| thing doubtful. @d Perhaps start a new arg @{if (a->arg) { Arglist * b = a->arg; a->arg = b->next; if (b->name == NULL) { a->bgn = a->pv = (char *)b->args; } else if (b->name == (Name *)1) { a->bgn = a->pv = "{Embedded Scrap}"; } else { pushArglist(mgr, b->args); }@} @d Otherwise pop the current arg @{} else { mgr->a = a->old; free(a); }@} @o scraps.c -cc -d @{static char prev_char(ArgManager * mgr, int n) { char c = '\0'; ArgMgr * a = mgr->a; Manager * m = mgr->m; if (a != NULL) { @ } else { @ } return c; } @| prev_char @} @d Get the nth previous character from an argument @{if (a->pv && a->pv - n >= a->bgn) c = *a->pv; else if (a->bgn) { int j = strlen(a->bgn) + 1; if (n >= j) c = a->bgn[j - n]; else c = ' '; } @} @d Get the nth previous character from a scrap @{int k = m->index - n - 2; if (k >= 0) c = m->scrap->chars[k]; else if (m->prev) c = m->prev->chars[SLAB_SIZE - k]; @} \subsection{Building the Automata} @d Function pro... @{extern void search(void); @} @o scraps.c -cc -d @{static void build_gotos(Name *tree); static int reject_match(Name *name, char post, ArgManager *reader); void search(void) { int i; for (i=0; i<128; i++) root[i] = NULL; max_depth = 10; depths = (Goto_Node **) arena_getmem(max_depth * sizeof(Goto_Node *)); for (i=0; i @ } @| search @} @o scraps.c -cc -d @{static void build_gotos(Name *tree) { while (tree) { @spelling|@> build_gotos(tree->rlink); tree = tree->llink; } } @| build_gotos @} @d Extend goto... @{{ int depth = 2; char *p = tree->spelling; char c = *p++; Goto_Node *q = root[(unsigned char)c]; Name_Node * last; if (!q) { q = (Goto_Node *) arena_getmem(sizeof(Goto_Node)); root[(unsigned char)c] = q; q->moves = NULL; q->fail = NULL; q->moves = NULL; q->output = NULL; q->next = depths[1]; depths[1] = q; } while ((c = *p++)) { Goto_Node *new = goto_lookup(c, q); if (!new) { Move_Node *new_move = (Move_Node *) arena_getmem(sizeof(Move_Node)); new = (Goto_Node *) arena_getmem(sizeof(Goto_Node)); new->moves = NULL; new->fail = NULL; new->moves = NULL; new->output = NULL; new_move->state = new; new_move->c = c; new_move->next = q->moves; q->moves = new_move; if (depth == max_depth) { int i; Goto_Node **new_depths = (Goto_Node **) arena_getmem(2*depth*sizeof(Goto_Node *)); max_depth = 2 * depth; for (i=0; inext = depths[depth]; depths[depth] = new; } q = new; depth++; } last = q->output; q->output = (Name_Node *) arena_getmem(sizeof(Name_Node)); q->output->next = last; q->output->name = tree; }@} @d Build failure functions @{{ int depth; for (depth=1; depthmoves; while (m) { char a = m->c; Goto_Node *s = m->state; Goto_Node *state = r->fail; while (state && !goto_lookup(a, state)) state = state->fail; if (state) s->fail = goto_lookup(a, state); else s->fail = root[(unsigned char)a]; if (s->fail) { Name_Node *p = s->fail->output; while (p) { Name_Node *q = (Name_Node *) arena_getmem(sizeof(Name_Node)); q->name = p->name; q->next = s->output; s->output = q; p = p->next; } } m = m->next; } r = r->next; } } }@} \subsection{Searching the Scraps} @d Search scraps @{{ for (i=1; ifail; if (state) state = goto_lookup(c, state); else state = root[(unsigned char)c]; @ @ @ last = c; c = argpop(&reader); if (state && state->output) { Name_Node *p = state->output; do { Name *name = p->name; if (!reject_match(name, c, &reader) && scrap_array(i).sector == name->sector && (!name->uses || name->uses->scrap != i)) { Scrap_Node *new_use = (Scrap_Node *) arena_getmem(sizeof(Scrap_Node)); new_use->scrap = i; new_use->next = name->uses; name->uses = new_use; if (!scrap_is_in(name->defs, i)) add_uses(&(scrap_array(i).uses), name); } p = p->next; } while (p); } } } }@} @d Skip over at at @{if (last == nw_char && c == nw_char) { last = '\0'; c = argpop(&reader); } @} @d Skip over a scrap use @{if (last == nw_char && c == '<') { char buf[MAX_NAME_LEN]; char * p = buf; Arglist * args; c = argpop(&reader); while ((c = argpop(&reader)) != nw_char) *p++ = c; c = argpop(&reader); *p = '\0'; if (sscanf(buf, "%p", &args) != 1) { fprintf(stderr, "%s: found an internal problem (3)\n", command_name); exit(-1); } pushArglist(&reader, args); }@} @d Forward declarations for scraps.c @{ static void add_uses(Uses **root, Name *name); static int scrap_is_in(Scrap_Node * list, int i); @} @o scraps.c -cc -d @{ static int scrap_is_in(Scrap_Node * list, int i) { while (list != NULL) { if (list->scrap == i) return TRUE; list = list->next; } return FALSE; } @| scrap_is_in@} @o scraps.c -cc -d @{ static void add_uses(Uses * * root, Name *name) { int cmp; Uses *p, **q = root; while ((p = *q, p != NULL) && (cmp = robs_strcmp(p->defn->spelling, name->spelling)) < 0) q = &(p->next); if (p == NULL || cmp > 0) { Uses *new = arena_getmem(sizeof(Uses)); new->next = p; new->defn = name; *q = new; } } @| add_uses @} @d Type dec... @{typedef struct uses { struct uses *next; Name *defn; } Uses; @| Uses @} @d Function prototypes @{extern void format_uses_refs(FILE *, int); @} @o scraps.c -cc -d @{ void format_uses_refs(FILE * tex_file, int scrap) { Uses * p = scrap_array(scrap).uses; if (p != NULL) @ } @| format_uses_refs @} @d Write uses ... @{{ char join = ' '; fputs("\\item \\NWtxtIdentsUsed\\nobreak\\", tex_file); do { @ join = ','; p = p->next; }while (p != NULL); fputs(".", tex_file); }@} @d Write one use reference @{Name * name = p->defn; Scrap_Node *defs = name->defs; int first = TRUE, page = -1; fprintf(tex_file, "%c \\verb%c%s%c\\nobreak\\ ", join, nw_char, name->spelling, nw_char); if (defs) { do { @ first = FALSE; defs = defs->next; }while (defs!= NULL); } else { fputs("\\NWnotglobal", tex_file); } @} @d Write one referenced scrap @{fputs("\\NWlink{nuweb", tex_file); write_scrap_ref(tex_file, defs->scrap, -1, &page); fputs("}{", tex_file); write_scrap_ref(tex_file, defs->scrap, first, &page); fputs("}", tex_file);@} @d Function prototypes @{extern void format_defs_refs(FILE *, int); @} @o scraps.c -cc -d @{ void format_defs_refs(FILE * tex_file, int scrap) { Uses * p = scrap_array(scrap).defs; if (p != NULL) @ } @| format_defs_refs @} @d Write defs ... @{{ char join = ' '; fputs("\\item \\NWtxtIdentsDefed\\nobreak\\", tex_file); do { @ join = ','; p = p->next; }while (p != NULL); fputs(".", tex_file); }@} @d Write one def reference @{Name * name = p->defn; Scrap_Node *defs = name->uses; int first = TRUE, page = -1; fprintf(tex_file, "%c \\verb%c%s%c\\nobreak\\ ", join, nw_char, name->spelling, nw_char); if (defs == NULL || (defs->scrap == scrap && defs->next == NULL)) { fputs("\\NWtxtIdentsNotUsed", tex_file); } else { do { if (defs->scrap != scrap) { @ first = FALSE; } defs = defs->next; }while (defs!= NULL); } @} \subsubsection{Rejecting Matches} A problem with simple substring matching is that the string ``he'' would match longer strings like ``she'' and ``her.'' Norman Ramsey suggested examining the characters occurring immediately before and after a match and rejecting the match if it appears to be part of a longer token. Of course, the concept of {\sl token\/} is language-dependent, so we may be occasionally mistaken. For the present, we'll consider the mechanism an experiment. @o scraps.c -cc -d @{#define sym_char(c) (isalnum(c) || (c) == '_') static int op_char(char c) { switch (c) { case '!': case '#': case '%': case '$': case '^': case '&': case '*': case '-': case '+': case '=': case '/': case '|': case '~': case '<': case '>': return TRUE; default: return c==nw_char ? TRUE : FALSE; } } @| sym_char op_char @} @o scraps.c -cc -d @{static int reject_match(Name *name, char post, ArgManager *reader) { int len = strlen(name->spelling); char first = name->spelling[0]; char last = name->spelling[len - 1]; char prev = prev_char(reader, len); if (sym_char(last) && sym_char(post)) return TRUE; if (sym_char(first) && sym_char(prev)) return TRUE; if (op_char(last) && op_char(post)) return TRUE; if (op_char(first) && op_char(prev)) return TRUE; return FALSE; /* Here is @xother@x */ } @| reject_match @} \section{Labels} Refer to @xlabel@x. And another one @xother@x. @d Get label from @{char label_name[MAX_NAME_LEN]; char * p = label_name; while (c = @1, c != nw_char) /* Here is @xlabel@x */ *p++ = c; *p = '\0'; c = @1; @} @o scraps.c -cc -d @{void write_label(char label_name[], FILE * file) @,@)@> @| write_label@} @d Function prototypes @{void write_label(char label_name[], FILE * file); @} @d Write the label to file @{write_single_scrap_ref(file, lbl->scrap); fprintf(file, "-%02d", lbl->seq);@} @d Complain about missing label @{fprintf(stderr, "Can't find label %s.\n", label_name);@} @d Save label to label store @{if (label_name[0]) @,@)@> else { @ }@} @d Create a new label en... @{lbl = (label_node *)arena_getmem(sizeof(label_node) + (p - label_name)); lbl->left = lbl->right = NULL; strcpy(lbl->name, label_name); lbl->scrap = current_scrap; lbl->seq = ++lblseq; *plbl = lbl;@} @d Complain about duplicate labels @{fprintf(stderr, "Duplicate label %s.\n", label_name);@} @d Complain about empty label @{fprintf(stderr, "Empty label.\n");@} @d Search for label(@'Found@',@'Notfound@') @{{ label_node * * plbl = &label_tab; for (;;) { label_node * lbl = *plbl; if (lbl) { int cmp = label_name[0] - lbl->name[0]; if (cmp == 0) cmp = strcmp(label_name + 1, lbl->name + 1); if (cmp < 0) plbl = &lbl->left; else if (cmp > 0) plbl = &lbl->right; else { @1 break; } } else { @2 break; } } } @} @o global.c -cc -d @{label_node * label_tab = NULL; @| label_tab@} @d Type declarations @{typedef struct l_node { struct l_node * left, * right; int scrap, seq; char name[1]; } label_node; @| label_node@} @d Global variable declarations @{extern label_node * label_tab; @} \section{Memory Management} \label{memory-management} I manage memory using a simple scheme inspired by Hanson's idea of {\em arenas\/}~\cite{hanson:90}. Basically, I allocate all the storage required when processing a source file (primarily for names and scraps) using calls to \verb|arena_getmem(n)|, where \verb|n| specifies the number of bytes to be allocated. When the storage is no longer required, the entire arena is freed with a single call to \verb|arena_free()|. Both operations are quite fast. @d Function p... @{extern void *arena_getmem(size_t n); extern void arena_free(void); @} @o arena.c -cc -d @{typedef struct chunk { struct chunk *next; char *limit; char *avail; } Chunk; @| Chunk @} We define an empty chunk called \verb|first|. The variable \verb|arena| points at the current chunk of memory; it's initially pointed at \verb|first|. As soon as some storage is required, a ``real'' chunk of memory will be allocated and attached to \verb|first->next|; storage will be allocated from the new chunk (and later chunks if necessary). @o arena.c -cc -d @{static Chunk first = { NULL, NULL, NULL }; static Chunk *arena = &first; @| first arena @} \subsection{Allocating Memory} The routine \verb|arena_getmem(n)| returns a pointer to (at least) \verb|n| bytes of memory. Note that \verb|n| is rounded up to ensure that returned pointers are always aligned. We align to the nearest 8-byte segment, since that'll satisfy the more common 2-byte and 4-byte alignment restrictions too. @o arena.c -cc -d @{void *arena_getmem(size_t n) { char *q; char *p = arena->avail; n = (n + 7) & ~7; /* ensuring alignment to 8 bytes */ q = p + n; if (q <= arena->limit) { arena->avail = q; return p; } @ } @| arena_getmem @} If the current chunk doesn't have adequate space (at least \verb|n| bytes) we examine the rest of the list of chunks (starting at \verb|arena->next|) looking for a chunk with adequate space. If \verb|n| is very large, we may not find it right away or we may not find a suitable chunk at all. @d Find a new chunk... @{{ Chunk *ap = arena; Chunk *np = ap->next; while (np) { char *v = sizeof(Chunk) + (char *) np; if (v + n <= np->limit) { np->avail = v + n; arena = np; return v; } ap = np; np = ap->next; } @ }@} If there isn't a suitable chunk of memory on the free list, then we need to allocate a new one. @d Allocate a new ch... @{{ size_t m = n + 10000; np = (Chunk *) malloc(m); np->limit = m + (char *) np; np->avail = n + sizeof(Chunk) + (char *) np; np->next = NULL; ap->next = np; arena = np; return sizeof(Chunk) + (char *) np; }@} \subsection{Freeing Memory} To free all the memory in the arena, we need only point \verb|arena| back to the first empty chunk. @o arena.c -cc -d @{void arena_free(void) { arena = &first; } @| arena_free @} \chapter{Man page} Here is the UNIX man page for nuweb: @O nuweb.1 @{.TH NUWEB 1 "local 3/22/95" .SH NAME Nuweb, a literate programming tool .SH SYNOPSIS .B nuweb .br \fBnuweb\fP [options] [file] ... .SH DESCRIPTION .I Nuweb is a literate programming tool like Knuth's .I WEB, only simpler. A .I nuweb file contains program source code interleaved with documentation. When .I nuweb is given a .I nuweb file, it writes the program file(s), and also produces, .I LaTeX source for typeset documentation. .SH COMMAND LINE OPTIONS .br \fB-t\fP Suppresses generation of the {\tt .tex} file. .br \fB-o\fP Suppresses generation of the output files. .br \fB-d\fP List dangling identifier references in indexes. .br \fB-c\fP Forces output files to overwrite old files of the same name without comparing for equality first. .br \fB-v\fP The verbose flag. Forces output of progress reports. .br \fB-n\fP Forces sequential numbering of scraps (instead of page numbers). .br \fB-s\fP Doesn't print list of scraps making up file at end of each scrap. \fB-p path\fP Prepend path to the filenames for all the output files. \fB-V string\fP Provide the string for replacement of the @@v operation. This is intended as a means for including version information in generated output. \fB-x\fP Include cross-references in comments in output files. \fB-h options\fP Turn on hyperlinks using the hyperref package of LaTeX and provide the options to the package. \fB-r\fP Turn on hyperlinks using the hyperref package of LaTeX, with the package options being in the text. \fB-I path\fP Provide a directory to search for included files. This may appear several times. .SH FORMAT OF NUWEB FILES A .I nuweb file contains mostly ordinary .I LaTeX. The file is read and copied to output (.tex file) unless a .I nuweb command is encountered. All .I nuweb commands start with an ``at-sign'' (@@). Files and fragments are defined with the following commands: .PP @@o \fIfile-name flags scrap\fP where scrap is smaller than one page. .br @@O \fIfile-name flags scrap\fP where scrap is bigger than one page. .br @@d \fIfragment-name scrap\fP. Where scrap is smallar than one page. .br @@D \fIfragment-name scrap\fP. Where scrap is bigger than one page. .PP @@q \fIfragment-name scrap\fP. Where scrap is smallar than one page. The scrap is not expanded in the output, allowing you to construct output files which can, perhaps after further processing, be input to .I nuweb. .br @@Q \fIfragment-name scrap\fP. Where scrap is bigger than one page. Likewise. .PP Scraps have specific begin and end markers; which begin and end marker you use determines how the scrap will be typeset in the .tex file: .br \fB@@{\fP...\fB@@}\fP for verbatim "terminal style" formatting or, with the -l flag, LaTeX listing package. .br \fB@@[\fP...\fB@@]\fP for LaTeX paragraph mode formatting, and .br \fB@@(\fP...\fB@@)\fP for LaTeX math mode formmating. .br Any amount of whitespace (including carriage returns) may appear between a name and the begining of a scrap. .PP Several code/file scraps may have the same name; .I nuweb concatenates their definitions to produce a single scrap. Code scrap definitions are like macro definitions; .I nuweb extracts a program by expanding one scrap. The definition of that scrap contains references to other scraps, which are themselves expanded, and so on. \fINuweb\fP's output is readable; it preserves the indentation of expanded scraps with respect to the scraps in which they appear. .PP .SH PER FILE OPTIONS When defining an output file, the programmer has the option of using flags to control the output. .PP \fB-d\fR option, .I Nuweb will emit line number indications at scrap boundaries. .br \fB-i\fR option, .I Nuweb supresses the indentation of fragments (useful for \fBFortran\fR). .br \fB-t\fP option makes \fInuweb\fP copy tabs untouched from input to output. .br \fB-c\fIx\fP Include comments in the output file. \fIx\fP may be \fBc\fP for C-style comments, \fB+\fP for C++ and \fBp\fP for perl and similar. .PP .SH MINOR COMMANDS .br @@@@ Causes a single ``at-sign'' to be copied into the output. .br @@\_ Causes the text between it and the next {\tt @@\_} to be made bold (for keywords, etc.) in the formatted document .br @@% Comments out a line so that it doesn't appear in the output. .br @@i \fBfilename\fR causes the file named to be included. .br @@f Creates an index of output files. .br @@m Creates an index of fragments. .br @@u Creates an index of user-specified identifiers. .PP To mark an identifier for inclusion in the index, it must be mentioned at the end of the scrap it was defined in. The line starts with @@| and ends with the \fBend of scrap\fP mark \fB@@}\fP. .PP .SH ERROR MESSAGES .PP .SH BUGS .PP .SH AUTHOR Preston Briggs. Internet address \fBpreston@@cs.rice.edu\fP. .SH MAINTAINER Simon Wright Internet address \fBsimon@@pushface.org\fP .br Keith Harwood Internet address \fBKeith.Harwood@@vitalmis.com\fP @} \chapter{Indices} \label{indices} Three sets of indices can be created automatically: an index of file names, an index of fragment names, and an index of user-specified identifiers. An index entry includes the name of the entry, where it was defined, and where it was referenced. \section{Files} @f \section{Fragments} @m \section{Identifiers} Knuth prints his index of identifiers in a two-column format. I could force this automatically by emitting the \verb|\twocolumn| command; but this has the side effect of forcing a new page. Therefore, it seems better to leave it this up to the user. @u \fi \bibliographystyle{amsplain} \bibliography{litprog,master,misc} \end{document}