Rd; Ruml Roclet;
and provide graphs and diagrams.

A conceivable timeline is as follows:
- May 26–June 1: Parse R objects and extract relevant data
- June 1–July 7: Parse Roxygen tags
- July 7–July 20: Generate intermediate representation
- July 20–August 11: Translate Rd, namespaces and collations from intermediate representation
Although R code should be parsed using internal procedures like
parse() and formals(), Roxygen blocks require a formal grammar.
The following EBNF representation needs to be refined to include
composite elements:
body = [ [ brief description, ] detailed description, ] { element };
brief description = { escaped text }, newline;
detailed description = { escaped text }, newline, newline;
element = simple element | demarcated element | list element |
          composite element;
simple element = tag symbol, keyword, { escaped text };
demarcated element = tag symbol, demarcated keyword, { text },
                     tag symbol, "end", demarcated keyword;
list element = tag symbol, list keyword, { item },
               tag symbol, "end", list keyword;
item = tag symbol, "item", { escaped text };
composite element = table | function | displayed function;
keyword = "name" | "alias" | "title" | "brief" | "usage" | "param" |
          "details" | "return" | "reference" | "note" | "attention" |
          "author" | "sa" | "see" | "example" | "keyword" | "source" |
          "n" | "section" | "e" | "em" | "b" | "squote" | "dquote" |
          "kbd" | "samp" | "package" | "file" | "email" | "url" |
          "var" | "env" | "option" | "command" | "dfn" | "cite" |
          "acronym" | "ref" | "R" | "dots" | "ldots" | "export" |
          "import" | "include" | "enc" | "concept" | "encoding" |
          "tab";
demarcated keyword = "code" | "verbatim" | "f" | "df";
list keyword = "enumeration" | "itemize" | "describe";
table = tag symbol, "table", { row }, tag symbol, "endtable";
row = { text, [ field delimiter ] }, row delimiter;
field delimiter = tag symbol, "tab";
row delimiter = tag symbol, "n";
function = tag symbol, "f", { text }, tag symbol, "endf";
displayed function = tag symbol, "df", { text }, tag symbol, "enddf";
tag symbol = "@";
escaped tag symbol = "\@";
text = ? UTF-8 visible characters ?;
escaped text = text - tag symbol | escaped tag symbol;
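As a rough illustration of the "simple element" and "escaped tag symbol" rules above, the following Python sketch splits a Roxygen block into a description and a list of (keyword, text) pairs. It is not a full parser for the grammar: demarcated, list and composite elements are ignored, and the names parse_simple_elements and unescape, as well as the sample block, are hypothetical.

```python
import re

# Matches an unescaped tag symbol followed by a keyword, e.g. "@param".
# The negative lookbehind skips the escaped form "\@".
TAG = re.compile(r"(?<!\\)@(\w+)")


def parse_simple_elements(block):
    """Split a Roxygen block into a description and (keyword, text) pairs."""
    matches = list(TAG.finditer(block))
    # Everything before the first tag is the (brief/detailed) description.
    end_of_description = matches[0].start() if matches else len(block)
    description = unescape(block[:end_of_description])
    elements = []
    for i, match in enumerate(matches):
        # An element's text runs until the next unescaped tag symbol.
        stop = matches[i + 1].start() if i + 1 < len(matches) else len(block)
        elements.append((match.group(1), unescape(block[match.end():stop])))
    return description, elements


def unescape(text):
    """Replace the escaped tag symbol with a literal "@" and trim whitespace."""
    return text.replace("\\@", "@").strip()


# Hypothetical sample block; the e-mail address uses the escaped tag symbol.
example = (
    "Greet a person.\n"
    "\n"
    "@param name the person's name\n"
    "@author J. Doe, jdoe\\@example.org\n"
)
description, elements = parse_simple_elements(example)
```

Here the escaped "\@" in the author line is kept as part of the text rather than starting a new element, which is the behaviour the "escaped text" rule is meant to capture.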
Rd keyword      Roxygen equivalent
------------    ------------------
name            name†
alias           alias*
title           title*
description     brief
usage           usage*
arguments       param
details         details
value           return
reference       reference*
note            note, attention
author          author
seealso         sa, see
examples        example
keyword         keyword*
docType         n/a
format          n/a
source          source*
S4method        n/a
cr              n
section         section
emph            e, em
strong          b
bold            b
sQuote          squote*
dQuote          dquote*
code            code
preformatted    verbatim
kbd             kbd*
samp            samp*
pkg             pkg*
file            file*
email           email*
url             url*
var             var†
env             env
options         options*
command         command*
dfn             dfn*
cite            cite*
acronym         acronym*
itemize         itemize*
enumerate       enumerate*
item            item*
describe        describe*
tabular         tabular*
link            ref
linkS4class     n/a
eqn             f*
deqn            df*
R               R*
enc             enc*
concept         concept*
encoding        encoding*
export          export*
import          import*
include         include*
slot            slot*
prototype       prototype*
* New keyword not found in Doxygen.
† Keyword exists in Doxygen, but with different semantics.
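The correspondence between Roxygen and Rd keywords lends itself to a plain lookup table. The Python sketch below encodes a handful of the rows and renders a (keyword, text) pair as Rd markup of the form \keyword{text}. The names ROXYGEN_TO_RD and to_rd are hypothetical, only single-argument keywords are covered, and braces inside the text are not escaped.

```python
# A few rows of the keyword correspondence, keyed by Roxygen keyword.
ROXYGEN_TO_RD = {
    "name": "name",
    "title": "title",
    "brief": "description",
    "details": "details",
    "return": "value",
    "sa": "seealso",
    "see": "seealso",
    "e": "emph",
    "em": "emph",
    # "b" corresponds to both "strong" and "bold"; "strong" is chosen
    # here as an assumption.
    "b": "strong",
}


def to_rd(keyword, text):
    """Render one (keyword, text) pair as Rd markup, e.g. \\value{...}."""
    try:
        rd_keyword = ROXYGEN_TO_RD[keyword]
    except KeyError:
        raise ValueError("no Rd equivalent known for @%s" % keyword)
    return "\\%s{%s}" % (rd_keyword, text)


rendered = to_rd("return", "A greeting.")
```

Keywords marked n/a on the Rd side have no entry, so asking for them raises an error instead of silently emitting invalid Rd.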
S-expressions are readily parsable and less verbose than their XML counterparts, without sacrificing readability. We therefore propose something like the following for an intermediate parse-tree representation:
(class (name "person")
  (slot (name "fullname")
    (description "The full name of the person"))
  (slot (name "birthyear")
    (description "The year of birth"))
  (prototype "Prototype person is named John Doe, 1971"))
The above is merely an example; the intermediate representation should be extensible and tied intimately to the grammar.
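To give a sense of how little machinery such a representation needs, here is a minimal S-expression reader sketched in Python. It assumes only parentheses, bare symbols and double-quoted strings without escape sequences. The name read_sexp and the choice of nested Python lists as the in-memory form are assumptions, not part of the proposal.

```python
import re

# A token is a parenthesis, a double-quoted string, or a bare symbol.
TOKEN = re.compile(r'[()]|"[^"]*"|[^\s()"]+')


def read_sexp(source):
    """Parse one S-expression into nested Python lists and strings."""
    tokens = TOKEN.findall(source)
    position = 0

    def read():
        nonlocal position
        token = tokens[position]
        position += 1
        if token == "(":
            items = []
            while tokens[position] != ")":
                items.append(read())
            position += 1  # consume the closing parenthesis
            return items
        if token == ")":
            raise ValueError("unexpected closing parenthesis")
        # Quoted strings lose their quotes; symbols are kept as-is.
        return token[1:-1] if token.startswith('"') else token

    return read()


tree = read_sexp('''
(class (name "person")
  (slot (name "fullname")
    (description "The full name of the person"))
  (prototype "Prototype person is named John Doe, 1971"))
''')
```

Symbols and strings both end up as Python strings in this sketch, and unbalanced input is not reported gracefully. A real reader would want to keep the two apart and validate its input against the grammar.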
Peter Danenberg <pcd at wikitex dot org>