cortav specification

cortav is a markup language designed to be a simpler, well-specified, and more capable alternative to markdown. its name derives from the Ranuir words cor “writing” and tav “document”, translating to something like “(plain) text document”.

the cortav format can be called cortavgil, or gil cortavi, to differentiate it from the reference implementation cortavsir or sir cortavi.

cortav vs. markdown

the most important difference between cortav and markdown is that cortav is strictly line-oriented. this choice was made to ensure that cortav was relatively easy to parse. so while a simple .ct file may look a bit like a .md file, in reality it’s a lot closer to gemtext than any flavor of markdown.

however, the differences go much deeper. the most distinctive feature of cortav is that its syntax is strongly recursive. with markdown, you can apply at most one styling to any given block or span of text. with cortav, you can nest as many styles as you like, and you can style text in places markdown wouldn’t ordinarily let you: within headings, inside link text, even in code listings if you absolutely insist (this needs to be turned on by a special directive before the listing in question, however).

this manual describes cortav exhaustively, but if you just want a quick reference on how markdown translates to cortav, look no further.

headings: cortav uses almost the same syntax for headings that markdown does, except it only allows the “ATX style” headings, with one or more hash characters at the start of the line. the only differences from markdown are:
- you can use the unicode section character § instead of # if you’re feeling snobby
- you must put a space between the control sequence (the sequence of hashes or section symbols, in this case) and the title text. # title creates a section with the heading text “title”, but #title creates a new section with no heading at all; instead, it gives the anonymous section the ID title. and of course, you can combine the two: #ttl title creates a section with the heading text “title” and the ID ttl. what are IDs for? we’ll get to that in a little bit
paragraphs are mostly the same as in markdown, except that a paragraph break occurs after every newline character, not every blank line. paragraphs can be indented by however many spaces you like; such indentation will be ignored. (tabs have a special meaning, however). in cortav, you can also explicitly mark a line of text as a paragraph by preceding it with a period character (.), which is useful if you want to start a paragraph with text that would otherwise be interpreted specially.
italic text — or rather, emphasized text — is written as [!my spiffy italic text]. in cortav, these spans can be nested within other spans (or titles, or table cells, or…), and the starting and ending point is unambiguous.
bold text — or rather, strong text — is written as [*my commanding bold text].
bold-italic text — or rather, overemphasized text — has no specific notation. rather, you create it by nesting one span within the other, for instance: [*[!my ostentatious bold-italic text]].
links are quite different from their markdown equivalents. cortav does not have inline links, as it is intended to be easily readable in both formatted and plain-text format, and long URLs rather disrupt the flow of reading. rather, a link tag is written with the notation [>nifty-link my nifty link], where the word nifty-link immediately following the arrow is an identifier indicating the destination of the link. (instead of a greater-than sign, you can also use the unicode arrow symbol →.) if the identifier is the same as one you’ve assigned to a document object, such as a section, cortav produces a link within the document to that object. otherwise, it will look for a reference (or failing that, a resource) to tell it the URI for the link. if nothing in the document matches the ID, an error will result and compilation will be aborted. (a reference is a key-value pair created by adding a line like nifty-link: https://zombo.com indented by exactly one tab. you can place this reference anywhere you like so long as it’s in the same section; if you want to name a reference in another section, you have to prefix it with that section’s ID, e.g. [>spiffy-section.nifty-link my nifty link declared in a spiffy section].)
lists use a different syntax from markdown. you can start a line with a * to create an unordered list, or : to create an ordered list; indentation doesn’t matter. if you want to nest list items, instead of putting two spaces before the child item, you just add another star or colon. and of course, you can nest lists of different kinds within one another.
horizontal rules use roughly the same syntax: three or more hyphens on a line of their own (---). underlines also work (___, -_-, __-__-__ etc), as do horizontal unicode box drawing characters (─ ━ ┈ etc).
some markdown implementations support tables. cortav does too, using a very simple notation similar to the usual notation used in markdown. a key difference, however, is that cortav table cells can contain any formatting a paragraph can.
underlines are supported by some markdown implementations. in cortav, you can apply them with the notation [_my underlined text] — please just use them sparingly when you render to HTML!
strikethrough is supported by some extended versions of markdown. cortav uses the notation [~my deleted text], with the intended semantics of text that is being removed by some revision of a document. (you can also denote text that is being added by using a plus sign instead of a tilde)
images are a bit more complicated, but much more versatile. see the section on resources for an explanation.
smart quotes and em dashes are inserted automatically, just as in markdown, provided you have the transmogrify extension available. (it is part of the reference implementation and defined by the spec, but not required.) in fact, you can insert longer dashes than em dashes just by increasing the number of hyphens. the reference implementation’s transmogrifier also translates ascii arrows like --> into their unicode equivalents (→).
literals (also known as code text) can be inserted with the [`int main(void);] syntax. note however that literals are not protected from the transmogrifier, and are parsed like any other span, which may cause problems if the source code you’re quoting makes use of such forbidden runes. in this case, you’ll want to wrap the code span in a raw span. the syntax for this is [`[\int main(void);]], but since this is a bit of an unwieldy syntax for a common operation, it can also be abbreviated as ["int main(void);].

of course, this is only a small taste of what cortav can do, not even touching on key features like macros, footnotes, or equation formatting. read the sections on blocks and spans for all the gory details.

encoding

a cortav document is made up of a sequence of codepoints. UTF-8 must be supported, but other encodings (such as UTF-32 or C6B) may be supported as well. lines will be derived by splitting the codepoints at the linefeed character or equivalent. note that unearthly encodings like C6B or EBCDIC will need to select their own control sequences.

file type

a cortav source file is identified using a file extension, file type, and/or magic byte sequence.

three file extensions are defined as identifying a cortav source file. where relevant, all must be recognized as indicating a cortav source file.

ct is the shorthand extension
cortav is the canonical disambiguation extension, for use in circumstances where *.ct is already defined to mean a different file format.
 is the canonical Corran extension, a byte sequence comprising the unicode codepoints U+E3CE U+E3BD. where the filesystem in question does not specify a filename encoding, the bytes should be expressed in UTF-8.

three more extensions are reserved for identifying a cortav intent file.

ctc is the shorthand extension
cortavcun is the canonical disambiguation extension
 is the canonical Corran extension, a byte sequence comprising the unicode codepoints U+E3CE U+E3BD U+E3CE. where the filesystem in question does not specify a filename encoding, the bytes should be expressed in UTF-8.

on systems which use metadata to encode filetype, two values are defined to identify cortav source files

text/x.cortav should be used when strings or arbitrary byte sequences are supported
CTAV (that is, the byte sequence 0x43 54 41 56) should be used on systems that support only 32-bit file types/4-character type codes like Classic Mac OS.

two more values are defined to identify cortav intent files.

text/x.cortav-intent
CTVC (the byte sequence 0x43 54 56 43)
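the extensions and type values listed above can be gathered into a small lookup. the following is only a sketch: the function names are hypothetical, and it assumes that extension matching is case-sensitive, which the text above does not settle.

```python
from typing import Optional

# extensions from the lists above; the Corran extensions are written as
# escapes for the private-use codepoints the spec names
SOURCE_EXTENSIONS = {"ct", "cortav", "\ue3ce\ue3bd"}
INTENT_EXTENSIONS = {"ctc", "cortavcun", "\ue3ce\ue3bd\ue3ce"}

# metadata values: (string form, four-character type code)
SOURCE_TYPES = {"text/x.cortav", "CTAV"}
INTENT_TYPES = {"text/x.cortav-intent", "CTVC"}


def kind_from_extension(filename: str) -> Optional[str]:
    """return "source", "intent", or None, judging by the text after the last dot."""
    # assumption: case-sensitive comparison
    _, dot, ext = filename.rpartition(".")
    if not dot:
        return None
    if ext in SOURCE_EXTENSIONS:
        return "source"
    if ext in INTENT_EXTENSIONS:
        return "intent"
    return None


def kind_from_metadata(value: str) -> Optional[str]:
    """return "source", "intent", or None for a filetype metadata value."""
    if value in SOURCE_TYPES:
        return "source"
    if value in INTENT_TYPES:
        return "intent"
    return None
```

a caller would presumably consult metadata first where the system provides it and fall back to the extension otherwise, but that ordering is a design choice of the sketch, not something the spec mandates.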

on systems which do not define a canonical way of encoding the filetype but support extended attributes of some kind, such as linux, an attribute named mime may be created and given the value text/x.cortav or text/x.cortav-intent; alternatively, extensions may be used.

it is also possible to indicate the nature of a cortav file without using filesystem metadata. this is done by prefixing the file with a magic byte sequence. the sequence used depends on the encoding. currently, only sequences for UTF-8 and ASCII are defined, as these are the only encodings supported by the reference implementation. in the event that other implementations add support for other encodings, their sequences will be standardized here.

for UTF-8 and ASCII plain text files, %ct\n (that is, the byte sequence 0x25 63 74 0A) should be used

consequently, this sequence should be ignored by a cortav parser at the start of a file (except as an indication of file format).
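as a minimal sketch of that rule, a parser front end might detect and discard the magic sequence like so. the name strip_magic is hypothetical and only the UTF-8/ASCII sequence defined above is handled.

```python
from typing import Tuple

MAGIC = b"%ct\n"  # the byte sequence 0x25 63 74 0A


def strip_magic(data: bytes) -> Tuple[bool, bytes]:
    """return (magic_was_present, remaining_bytes)."""
    if data.startswith(MAGIC):
        return True, data[len(MAGIC):]
    return False, data
```

the boolean is returned separately so that the caller can still use the presence of the sequence as an indication of file format.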

for FreeDesktop-based systems, the build/velartrill-cortav.xml file included in the repository supplies mappings for the extensions and magic byte sequences. a script is also included which can be registered with xdg-open so that double-clicking on a cortav file will render it out and open it in your default web browser. $ make install will generate the necessary FreeDesktop XML files and register them, as well as install the script and the cortav executable itself. for more information see building the reference implementation.

levels

not all of cortav’s features make sense in every context. for this reason, cortav defines four levels of compliance. for example, a social media platform that enables simple paragraph styling and linking using cortav syntax may claim to be “cortav level 1 compliant”. every level N is a strict superset of level N−1.

level 1: styling. simple inline formatting sequences like strong, emphatic, literal, links, etc. math equation styling need not be supported. paragraphs, lists, and references are the only block elements supported. suitable for styling tweets and other very short content.
level 2: layout. implements header, paragraph, newline, directive, and reference block elements. supports resources at least for remote or attached images. suitable for longer social media posts.
level 3: publishing. implements all currently standardized core behavior, including zero or more extensions.
level 4: reference. implements all currently standardized behavior, including all standardized extensions.

structure (block elements)

cortav is based on an HTML-like block model, where a document consists of sections, which are made up of blocks, which may contain a sequence of spans. flows of text are automatically conjoined into spans, and blocks are separated by one or more newlines. this means that, unlike in markdown, a single logical paragraph cannot span multiple ASCII lines. the primary purpose of this was to ensure ease of parsing, but also, both markdown and cortav are supposed to be readable from within a plain text editor. this is the 21st century. every reasonable text editor supports soft word wrap, and if yours doesn’t, that’s entirely your own damn fault. hard-wrapping lines is incredibly user-hostile, especially to users on mobile devices with small screens. cortav does not allow it.

the first character(s) of every line (the “control sequence”) indicates the role of that line. if no control sequence is recognized, the line is treated as a paragraph. the currently supported control sequences are listed below. some control sequences have alternate forms, in order to support modern, readable unicode characters as well as plain ascii text.

paragraphs (. ¶ ❡): a paragraph is a simple block of text. the period control sequence is only necessary if the paragraph text starts with text that would be interpreted as a control sequence otherwise
newlines \: inserts a line break into the previous paragraph and attaches the following text. mostly useful for poetry or lyrics
section starts # §: starts a new section. all sections have an associated depth, determined by the number of sequence repetitions (e.g. “###” indicates depth three). sections may have headers and IDs; both are optional. IDs, if present, are a sequence of raw-text immediately following the hash marks. if the line has one or more space characters followed by styled-text, a header will be attached. the character immediately following the hashes can specify a particular type of section. e.g.:
- # is a simple section break.
- #anchor opens a new section with the ID anchor.
- # header opens a new section with the title “header”.
- #anchor header opens a new section with both the ID anchor and the title “header”.
nonprinting sections (^): sometimes, you’ll want to create a namespace without actually adding a visible new section to the document. you can achieve this by creating a nonprinting section and defining resources within it. nonprinting sections can also be used to store comments, notes, to-dos, or other meta-information that is useful to have in the source file without it becoming a part of the output. nonprinting sections can be used for a sort of “literate markup,” where resource and reference definitions can intermingle with human-readable narrative about those definitions. note that unlike comments, nonprinting sections are still parsed and can still affect other sections by means of definitions and pragmata.
resource (@): defines a resource. a resource is a file or object that is to be embedded in the document somehow. common examples of resources include images, videos, iframes, or headers/footers. resources can be defined inline, or reference external objects that are read in either at compile-time or view-time. see resources for more information.
lists (* :): these are like paragraph nodes, but list nodes that occur next to each other will be arranged so as to show they compose a sequence. depth is determined by the number of stars/colons. like headers, a list entry may have an ID that can be used to refer back to it; it is indicated in the same way. if colons are used, this indicates that the order of the items is significant. :-lists and *-lists may be intermixed; however, note that only the last character in the sequence actually controls the type. a blank line terminates the current list.
directives (%): a directive issues a hint to the renderer in the form of an arbitrary string. directives are normally ignored if they are not supported, but you may cause a warning to be emitted where the directive is not supported with %! or mark a directive critical with %!! so that rendering will entirely fail if it cannot be obeyed.
comments (%%): a comment is a line of text that is simply ignored by the renderer.
asides (!): indicates text that diverges from the narrative, and can be skipped without interrupting it. think of it like block-level parentheses. asides which follow one another are merged as paragraphs of the same aside, usually represented as a sort of box. if the first line of an aside contains a colon, the stretch of styled-text from the beginning of the aside to the colon will be treated as a “type heading,” e.g. “Warning:”
code (~~~): a line beginning with ~~~ begins or terminates a block of code. code blocks are by default not parsed, but parsing can be activated by preceding the code block with an %expand directive. the opening line should look like one of the below
- ~~~
- ~~~ language (markdown-style shorthand syntax)
- ~~~ [language] ~~~ (cortav syntax)
- ~~~ [language] #id ~~~
- ~~~ title ~~~
- ~~~ title [language] ~~~
- ~~~ [language] title ~~~
- ~~~ title [language] #id ~~~
definition (tab¹): a line beginning with a tab² is a multipurpose metadata syntax. the tab may be followed by an identifier, a colon, and a value string, in which case it opens a new definition; alternatively, a second tab character turns the line into a definition continuation, adding the remaining characters as a new line to the definition value on the previous line. when a new definition is opened on a line immediately following certain kinds of objects, such as resources, embeds, or multiline macro expansions, it attaches key-value metadata to that object. when a definition is not preceded by such an object, an independent reference is created instead.
- a reference is a general mechanism for out-of-line metadata, and references are used in many different ways — e.g. to specify link destinations, footnote contents, abbreviations, or macro bodies. to ensure that a definition is interpreted as a reference, rather than as metadata for an object, precede it with a blank line.
quotation (<): a line of the form <name> quote denotes an utterance by name.
blockquote (>id body): “inline” blockquote syntax. can be nested by repeating the > character. the id is optional, but the > character must be immediately followed by whitespace if the block is not to have an ID.
subtitle/caption (--): attaches a subtitle to the previous header, or caption to the previous object. after a blockquote, attaches an attribution line
embed (&): embeds a referenced object. can be used to show images or repeat previously defined objects like lists or tables, optionally with a caption. an embed line can be followed immediately by a sequence of definitions in the same way that resource definitions can, to override resource properties on a per-instance basis. note that only presentation-related properties like desc can be meaningfully overridden, as embed does not trigger a re-render of the parse tree; if you want to override e.g. context variables, use a multiline macro invocation instead.
- &image embeds an image or other block-level object. image can be a reference with a url or file path, or it can be an embed section (e.g. for SVG files)
  - &myimg All that remained of the unfortunate blood magic pageant contestants and audience (police photo)
- &-ident styled-text embeds a closed disclosure element containing the text of the named object (a nonprinting section or cortav resource should usually be used to store the content; it can also name an image or video, of course). in interactive outputs, this will display as a block which can be clicked on to view the full contents of the referenced object ident; if styled-text is present, it overrides the title of the section you are embedding (if any). in static outputs, the disclosure object will display as an enclosed box with styled-text as the title text
  - &-ex-a Prosecution Exhibit A (GRAPHIC CONTENT)
- &+section styled-text is like the above, but the disclosure element is open by default
$macro arg1|arg2|argn… invokes a block-level macro with the supplied arguments, and can be followed by a property override definition list the same way embed and resource lines can. note that while both $id and &id can be used to instantiate resources of type text/x.cortav, there is a critical difference: $id renders out the sub-document separately each time it is named, allowing for parameter expansion and for context variables to be overridden for each invocation. by contrast, &id can only insert copies of the same render; no parameters can be passed and context variables will be expanded to their value at the time the resource was defined. only &id can instantiate resources of types other than text/x.cortav. there is also a semantic distinction: resources interpreted as macros are inserted “in-band”, on an equal basis with nearby elements; resources interpreted as embeds are set off to clearly indicate that they are a sub-document, and on interactive outputs may have their own independently-scrolling viewport.
horizontal rule (---): inserts a horizontal rule or other context break; does not end the section. must be followed by newline. underlines can also be used in place of dashes (___, -_-, __-__-__ etc), as can horizontal unicode box drawing characters (─ ━ ┈ etc).
page break (^^): for formats that support pagination, like EPUB or HTML (when printed), indicates that the rest of the current page should be blank. for formats that do not, extra margins will be inserted. does not create a new section
page rule (^-^): inserts a page break for formats that support them, and a horizontal rule for formats that do not. does not create a new section. comprised of any number of horizontal rule characters surrounded by a pair of carets (e.g. ^-^ ^_^ ^⸻^ ^__—^ ^┈┈┈┈┈^)
table cells (+ |): see table examples.
equations (=): block-level equations can be inserted with the = sequence
cross-references (=> ⇒): inserts a block-level link. has two forms for the sake of gemtext compatibility. styled-text is a descriptive text of the destination. especially useful for menus and gemtext output.
- the cortav syntax is =>ident styled-text, where ident is an identifier; links to the same destination as [>ident styled-text] would
- the compatibility syntax is => uri styled-text (note the space before uri!). instead of taking an identifier for an object in the document, it directly accepts a URI. note that this is not formally equivalent to gemtext’s link syntax, which also allows paths in place of URIs; cortav does not. the gemtext line => /somewhere would need to be expressed as => file:/somewhere, and => /somewhere?key=val as http:/somewhere?key=val (or gemini:/somewhere?key=val, if the result is to be served over a gemini server).
empty lines (that is, lines consisting of nothing but whitespace) constitute a break, which terminates multiline objects that do not have a dedicated termination sequence, for example lists and asides.
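to make the line-oriented model concrete, here is a rough sketch of how a parser might dispatch on a handful of the control sequences above. it is not the reference implementation's logic: every function and class name is hypothetical, only sections, lists, directives, comments, definitions, and breaks are recognized, section type characters and list IDs are ignored, and it assumes leading spaces may be skipped before any control sequence.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Section(NamedTuple):
    depth: int
    ident: Optional[str]
    title: Optional[str]


def parse_section(line: str) -> Optional[Section]:
    """split a section line into depth, ID, and header text."""
    # marks, then an optional ID glued to them, then optional header text
    m = re.match(r"([#§]+)(\S*)(?:\s+(.*))?$", line)
    if m is None:
        return None
    marks, ident, title = m.groups()
    return Section(len(marks), ident or None, title or None)


def parse_list_marker(line: str) -> Optional[Tuple[int, bool]]:
    """return (depth, ordered); only the last marker decides the list type."""
    m = re.match(r"[*:]+", line.lstrip(" "))
    if m is None:
        return None
    marks = m.group(0)
    return len(marks), marks[-1] == ":"


def directive_level(line: str) -> Optional[str]:
    """classify a directive by how strictly it must be honored."""
    if not line.startswith("%") or line.startswith("%%"):
        return None  # not a directive (%% is a comment)
    if line.startswith("%!!"):
        return "critical"
    if line.startswith("%!"):
        return "important"
    return "hint"


def classify(line: str) -> str:
    """name the role of a line; unhandled roles are reported as "other"."""
    if not line.strip():
        return "break"
    if line.startswith("\t"):
        return "definition"
    text = line.lstrip(" ")  # assumption: leading spaces are insignificant
    if text.startswith("%%"):
        return "comment"
    if text.startswith("%"):
        return "directive"
    if text[0] in "#§":
        return "section"
    if text[0] in "*:":
        return "list"
    return "other"  # paragraphs and every control sequence not handled here
```

because each line is classified on its own, the dispatcher needs no lookahead; multiline objects such as lists are assembled afterwards by grouping adjacent lines of the same kind until a break is seen.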

styled text (span elements)

most blocks contain a sequence of spans. these spans are produced by interpreting a stream of styled-text following the control sequence. styled-text is a sequence of codepoints potentially interspersed with escapes. an escape is formed by an open square bracket [ followed by a span control sequence, and arguments for that sequence like more styled-text. escapes can be nested.

strong [*styled-text]: causes its text to stand out from the narrative, generally rendered as bold or a brighter color.
emphatic [!styled-text]: indicates that its text should be spoken with emphasis, generally rendered as italics
custom style [.id styled-text]: applies a specially defined font style. for example, if you have defined caution to mean “demibold italic underline”, cortav will try to apply the proper weight and styling within the constraints of the current font to the span styled-text. see the fonts section for more information about this mechanism.
literal [`styled-text]: indicates that its text is a reference to a literal sequence of characters or other discrete token. generally rendered in monospace
variable [$styled-text]: indicates to the reader that its text is a placeholder, rather than a literal representation. generally rendered in italic monospace, ideally of a different color
underline [_styled-text]: underlines the text. use sparingly on text intended for webpages — underlined text is distinct from links, but underlining non-links is still a violation of convention.
strikeout [~styled-text]: indicates that its text should be struck through or otherwise indicated for deletion
insertion [+styled-text]: indicates that its text should be indicated as a new addition to the text body.
- consider using a macro definition edit: [~[#1]][+[#2]] to save typing if you are doing editing work
link [>ref styled-text]: produces a hyperlink or cross-reference denoted by ref, which may be either a URL specified with a reference or the name of an object like an image or section elsewhere in the document. the unicode characters → and 🔗 can also be used instead of > to denote a link.
footnote [^ref styled-text]: annotates the text with a defined footnote. in interactive output media [^citations.qtheo Quantum Theosophy: A Neophyte's Catechism] will insert a link with the text Quantum Theosophy: A Neophyte’s Catechism that, when clicked, causes a footnote to pop up on the screen. for static output media, the text will simply have a superscript integer after it denoting where the footnote is to be found. ref can be the ID of a reference, in which case the reference value is parsed as cortav markup to form the body of the footnote; it can also be the ID of a resource, which can be of any MIME type compatible with the current renderer, such as text/x.cortav, text/plain, or image/png.
superscript ['styled-text]
subscript [,styled-text]
raw [\ raw-text]: causes all characters within to be interpreted literally, without expansion. the only special characters are square brackets, which must have a matching closing bracket, and backslashes.
raw literal ["raw-text]: shorthand for a raw inside a literal, that is [`[\…]]
macro {name arguments}: invokes a macro inline, specified with a reference. if the result of macro expansion contains newlines, they will be treated as line breaks, rather than paragraph breaks as they would be in a multiline context.
argument [#var]: in macros only, inserts the var-th argument. otherwise, inserts a context variable provided by the renderer.
raw argument [##var]: like above, but does not evaluate var.
term [&name], [&name expansion]: quotes a defined term with a link to its definition, optionally with a custom expansion of the term (for instance, to expand the first use of an acronym)
inline image [&@name]: shows a small image or other object inline. the unicode character 🖼 can also be used instead of &@.
unicode codepoint [Uhex-integer]: inserts an arbitrary UCS codepoint in the output, specified by hex-integer. lowercase u is also legal, as are U+ and u+.
math mode [=equation]: activates additional transformations on the span to format it as a mathematical equation; e.g. * becomes × and / → ÷.
extension [%ext …]: invokes extension named in ext. ext will usually be an extension name followed by a symbol (often a period) and then an extension-specific directive, although for some simple extensions it may just be the plain extension name. further syntax and semantics depend on the extension. this syntax can also be used to apply formatting specific to certain renderers, such as assigning a CSS class in the html renderer ([%html.myclass my [!styled] text]).
important extension [%!ext …]: like extension, but will issue a warning if the requested extension is not available
critical extension [%!!ext …]: like important extension, but will trigger an error and abort compilation if the requested extension is not available
extension text [%:ext styled-text]: like extension, but when the requested extension is not present, styled-text will be emitted as-is. this is a better way to apply CSS classes, as the text will still be visible when rendered to formats other than HTML.
inline comment [%%...]: ignored. useful for editorial annotations not intended to be part of the rendered product.
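the nesting of escapes lends itself to a small recursive-descent parser. the sketch below is illustrative only: parse_spans is a hypothetical name, it assumes every control sequence is a single character (which is not true of sequences like %!! or &@), it gives raw spans and backslashes no special treatment, and it treats a stray closing bracket outside any escape as plain text.

```python
from typing import List, Tuple, Union

# a node is either plain text or (control character, child nodes)
Node = Union[str, Tuple[str, list]]


def parse_spans(text: str) -> List[Node]:
    """turn styled-text into a tree of plain strings and nested escapes."""
    nodes, _ = _parse(text, 0, False)
    return nodes


def _parse(text: str, pos: int, nested: bool) -> Tuple[List[Node], int]:
    nodes: List[Node] = []
    buf: List[str] = []

    def flush() -> None:
        # move any pending plain text into the node list
        if buf:
            nodes.append("".join(buf))
            buf.clear()

    while pos < len(text):
        ch = text[pos]
        if ch == "[":
            if pos + 1 >= len(text):
                raise ValueError("escape opened at end of text")
            flush()
            control = text[pos + 1]  # assumption: one-character control
            children, pos = _parse(text, pos + 2, True)
            nodes.append((control, children))
        elif ch == "]" and nested:
            flush()
            return nodes, pos + 1  # this escape is closed
        else:
            buf.append(ch)
            pos += 1
    if nested:
        raise ValueError("unclosed escape")
    flush()
    return nodes, pos
```

for example, parse_spans("[*[!hi]] there") yields a strong node wrapping an emphatic node, followed by the plain text " there". a real parser would then interpret each node's children according to its control sequence, e.g. splitting the ref off the front of a link.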

tables

tables are encoded using a very simple notation. any line that begins with a plus + or bar | denotes a table row. each plus or bar separates one column from the other: a plus opens a new header cell, a bar opens a new normal cell.

the alignment of a cell can be specified by placing colons at one edge or both edges of the given cell. a colon on the left (|: my table cell |) indicates a left-aligned cell, a colon on the right a right-aligned cell (| my table cell :|), and a colon on both sides a centered cell (|: my table cell :|). if you want to use a special character without it being eaten by the table parser, just put a backslash in front of it, e.g. | this cell \| contains a pipe \+ a plus sign and ends with a colon \:|. and of course, table cells are just normal spans: they can contain any other kind of span formatting you like, such as links, emphasis, or footnotes.

no other features (like colspans or rowspans) are currently part of the spec but they will be added eventually (if i can figure out a decent way to implement them without creating a huge mess).

you can finish each row with a bar or plus character, but it’s not necessary. only do it if you think it makes the source easier to read.
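the row rules above can be sketched as a single left-to-right scan. this is a hedged illustration, not the reference implementation: Cell and parse_row are hypothetical names, a backslash is assumed to escape whatever single character follows it, alignment colons are assumed to sit directly against the cell edge, and a final + or | followed only by whitespace is assumed to close the row rather than open an empty cell.

```python
from typing import List, NamedTuple, Optional


class Cell(NamedTuple):
    header: bool          # True if the cell was opened with +
    align: Optional[str]  # "left", "right", "center", or None
    text: str


def parse_row(line: str) -> List[Cell]:
    if not line or line[0] not in "+|":
        raise ValueError("a table row must begin with + or |")
    raw = []  # each entry: (opener, [(char, was_escaped), ...])
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            raw[-1][1].append((line[i + 1], True))  # escaped: never special
            i += 2
        elif ch in "+|":
            raw.append((ch, []))
            i += 1
        else:
            raw[-1][1].append((ch, False))
            i += 1
    # a trailing + or | with nothing after it merely closes the row
    if len(raw) > 1 and all(c.isspace() for c, _ in raw[-1][1]):
        raw.pop()
    names = {(True, True): "center", (True, False): "left", (False, True): "right"}
    cells = []
    for opener, chars in raw:
        left = len(chars) > 0 and chars[0] == (":", False)
        right = len(chars) > 1 and chars[-1] == (":", False)
        body = chars[(1 if left else 0):(len(chars) - 1 if right else len(chars))]
        text = "".join(c for c, _ in body).strip()
        cells.append(Cell(opener == "+", names.get((left, right)), text))
    return cells
```

tracking whether each character was escaped, rather than deleting the backslashes up front, is what lets an escaped colon at the end of a cell survive as text instead of being read as an alignment marker.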

an example of table notation

identifiers

an identifier is a string which unambiguously names a section, block, reference, or other object of interest. every section has its own identifier namespace; to reference an object in one section from a different section, the identifier must be written as sec.obj, where sec is the ID of the containing section and obj is the ID of the object one wishes to reference. subdocuments (such as blockquotes or resources of type text/x.cortav) have their own namespace collection, so an object defined within e.g. a blockquote will not escape to the enclosing context; however, subdocuments can reference objects from the containing document in the usual fashion.

identifiers can be composed through interpolation in macro expansions. for instance, the macro expansion

xref: (see [>link-[#1] [#2]] by [#3])

the 25,953CE accession of the Hyperion Entity to the Throne Unyielding is now widely considered by the collective of ascended masters to have been fraudulent {xref disc-artax|Discursus Immundus on the Immaterial Doctrines of Redemption & Liquidation|Hierophant Artaxerxes MXIV}, but at the time was received with the near-unanimous adulation of the Manifold Hierophanies. an early dissenting voice, the Kakistarch Philomene Adumbratio of Forbidden Zone 969,

is equivalent to

the 25,953CE accession of the Hyperion Entity to the Throne Unyielding is now widely considered by the collective of ascended masters to have been fraudulent (see [>link-disc-artax Discursus Immundus on the Immaterial Doctrines of Redemption & Liquidation] by Hierophant Artaxerxes MXIV), but at the time was received with the near-unanimous adulation of the Manifold Hierophanies. an early dissenting voice, the Kakistarch Philomene Adumbratio of Forbidden Zone 969,
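the interpolation step in this example can be approximated in a few lines. this is a sketch under assumptions: expand_macro is a hypothetical name, arguments are assumed to be split on every | character, and an argument index with no matching argument is assumed to expand to nothing. raw arguments ([##var]) and named context variables are deliberately left untouched.

```python
import re


def expand_macro(body: str, arguments: str) -> str:
    """replace each [#n] in a macro body with the n-th |-separated argument."""
    args = arguments.split("|")

    def substitute(match: "re.Match[str]") -> str:
        index = int(match.group(1))
        # assumption: out-of-range indices expand to the empty string
        return args[index - 1] if 1 <= index <= len(args) else ""

    # a single pass, so text inserted from an argument is not rescanned
    return re.sub(r"\[#(\d+)\]", substitute, body)
```

since the substitution is purely textual, the identifier link-disc-artax only comes into existence after expansion; it is then resolved like any identifier written out by hand.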

identifiers dereferenced through macro expansions which lack an explicit section prefix are first evaluated in the context of the section in which the macro was defined, rather than the section in which it was expanded. the latter is only searched if the definition section has no object with a matching identifier. this behavior, while useful, is not always desirable. to force the resulting identifier (whether composed through interpolation or written out explicitly) to be evaluated in the context of the macro expansion, prefix it with a period (.) to form an expansion-site identifier. for example:

#alpha section alpha
	link: http://example.net
	macro-plain-id: [>link link to example.net]
	macro-expsite-id: [>.link link to section-dependent destination]

here are links to example.net:
* {macro-plain-id}
* {macro-expsite-id}
* {beta.macro-expsite-id}

here are links to both sites:
* {macro-plain-id} [%% example.net]
* {beta.macro-plain-id} [%% zombo com]

#beta section beta
	link: http://zombo.com
	macro-plain-id: [>link link to zombo com]
	macro-expsite-id: [>.link link to some website somewhere]

here are links to zombo com:
* {macro-plain-id}
* {macro-expsite-id}
* {alpha.macro-expsite-id}

here are links to both sites:
* {macro-plain-id} [%% zombo com]
* {alpha.macro-plain-id} [%% example.net]

resources

a resource represents content that is not encoded directly into the source file, but which is embedded by some means in the output. resources can either be embedded, in which case they are compiled into the final document itself, or they can be linked, in which case the final document only contains a URI or similar tag referencing the resource. not all render backends support both linking and embedding, nor do all backends support all object types (for instance, groff does not support video embedding).

a resource definition is begun by a line consisting of an @ sign and an identifier. this line is followed by any number of parameters. a parameter is a line beginning with a single tab, followed by a keyword, a colon, and then a value. additional lines can be added to a parameter by following it with a line that consists of two tabs followed by the text you wish to add. (this is the same syntax used by references.) a resource definition is terminated by a break, or by any line that does not begin with a tab.
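as a rough illustration of these rules, here is a minimal python sketch of a parser for a single resource definition. the function name parse_resource and the returned data shape are assumptions made for this example. it deliberately ignores the opening-line shortcuts and inline resources.

```python
def parse_resource(lines):
    """Parse one resource definition into (identifier, params).

    lines is a list of source lines starting at the '@' line.
    params maps each keyword to the list of value lines given for it.
    """
    header = lines[0]
    if not header.startswith("@"):
        raise ValueError("a resource definition must begin with @")
    # an empty identifier is returned as None
    identifier = header[1:].strip() or None
    params = {}
    current = None
    for line in lines[1:]:
        if line.startswith("\t\t"):
            # two tabs: an additional line for the most recent parameter
            if current is None:
                break
            params[current].append(line[2:].strip())
        elif line.startswith("\t"):
            # one tab: a new "keyword: value" parameter
            key, sep, value = line[1:].partition(":")
            if not sep:
                break
            current = key.strip()
            params[current] = [value.strip()]
        else:
            # any line that does not begin with a tab ends the definition
            break
    return identifier, params
```

parsing stops at the first line that does not begin with a tab, so the prose that follows a definition is left for the caller to handle.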

a resource definition in use looks like this:

this is a demonstration of resources

@smiley

src: link image/webp http://cdn.example.net/img/smile.webp

link image/png file:img/smile.png

embed image/gif file:img/smile.gif

desc: the Smiling Man would like to see you in his office

here is the resource in span context [&smiley]

and here it is in block context:

&smiley

rendered as HTML, this might produce the following:

<style>

.res-smiley {

content: image-set(

url(http://cdn.example.net/img/smile.webp) type(image/webp),

url(img/smile.png) type(image/png),

url(data:image/gif;base64,/* … */) type(image/gif)

); /* this will actually be repeated with a -webkit- prefix */

}

</style>

<p>this is a demonstration of resources</p>

<p>here is the resource in span context: <span class="res-smiley"></span></p>

<p>and here it is in block context:</p>

note that empty elements with CSS classes are used in the output, to avoid repeating long image definitions (especially base64 inline encoded ones!)

in the opening line of a resource declaration, the identifier can be omitted. in this case, rather than registering a new resource in the current section, the resource will be inserted as a block at the position where it is defined. this is a shorthand for resources that are only used once. for resources used inline or multiple times, an identifier must be defined.

as an additional shortcut, a resource with only one source file can place its source specification on the opening line, separated by whitespace from the opening sequence. this means that the following are identical.

@smiley link image/webp file:smile.webp

desc: the Smiling Man would like to see you in his office

@smiley

src: link image/webp file:smile.webp

desc: the Smiling Man would like to see you in his office

this can be combined with ID omission for a very concise block-level image syntax.

@smiley

src: link image/webp file:smile.webp

&smiley

%% is the same as

@smiley link image/webp file:smile.webp

&smiley

%% is the same as

@ link image/webp file:smile.webp

%% is almost the same as

@ image/webp;base64 (*

%% inhuman gurgling in textual form

%% (except that the last will require embedding)

inline resources are defined a bit differently:

@smiling-man-business-card text/plain {

THE SMILING MAN | tel. 0-Ω00-666█

if you can read this | email: nameless@smiles.gov

it is already too late | address: right behind you

}

@smiling-man-business-card image/png;base64 {

%% incomprehensible gibbering redacted

}

for an inline resource, the identifier is followed by a MIME type and an opening bracket. the opening bracket may be any of the characters { [ ( <, and can optionally be followed by additional characters to help disambiguate the closing bracket. the closing bracket is determined by “flipping” the opening bracket, producing bracket pairs like the following:

{::}
<!-- --!>
(*<>*)
<><> (disables nesting!)

if the open and closing brackets are distinguishable, they will nest appropriately, meaning that {} alone is very likely to be a safe choice to escape a syntactically correct C program (that doesn’t abuse macros too badly). brackets are searched for during parsing; encoded resources are not decoded until a later stage, so a closing bracket character in a base64-encoded text file cannot break out of its escaping.
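the “flipping” rule can be sketched in a few lines of python. this is an illustration rather than the reference implementation. it assumes that flipping means reversing the opening sequence and mirroring each bracket character, which reproduces all four pairs listed above. the names closing_bracket and can_nest are hypothetical.

```python
# each bracket character maps to its mirror image; other characters are unchanged
MIRROR = {
    "{": "}", "}": "{",
    "[": "]", "]": "[",
    "(": ")", ")": "(",
    "<": ">", ">": "<",
}

def closing_bracket(opening):
    """Derive the closing sequence by reversing and mirroring the opening one."""
    return "".join(MIRROR.get(ch, ch) for ch in reversed(opening))

def can_nest(opening):
    """Brackets can only nest if open and close are distinguishable."""
    return closing_bracket(opening) != opening
```

for example, closing_bracket("(*<") gives ">*)", while "<>" flips to itself, which is why that last pair disables nesting.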

as a convenience, if the first line of the resource definition begins with a single tab, one tab will be dropped from every following line in order to allow legible indentation. similarly, if an opening bracket is followed immediately by a newline, this newline is discarded.

text within a resource definition body is not expanded unless the resource definition is preceded with an %expand directive or the resource MIME type is text/x.cortav. if an expand directive is found, the MIME type will be used to try and determine an appropriate type of formatting, potentially invoking a separate renderer. for example, text/html will invoke the html backend, and application/x-troff will invoke the groff backend. if no suitable renderer is available, expansions will generate only plain text.

two suffixes are accepted: ;base64 and ;hex. the former will decode the presented strings using the base64 algorithm to obtain the resource’s data; the latter will ignore all characters but ASCII hexadecimal digits and derive the resource data byte-by-byte by reading in hexadecimal pairs. for instance, the following sections are equivalent:

@propaganda text/plain {

WORLDGOV SAYS

“don't waste time with unproductive thoughts

your wages will be docked accordingly”

}

@propaganda text/plain;hex {

574f 524c 4447 4f56 2053 4159 530a e280 9c64 6f6e 2774 2077 6173

7465 2074 696d 6520 7769 7468 2075 6e70 726f 6475 6374 6976 6520

7468 6f75 6768 7473 0a20 796f 7572 2077 6167 6573 2077 696c 6c20

6265 2064 6f63 6b65 6420 6163 636f 7264 696e 676c 79e2 809d 0a

}

@propaganda text/plain;base64 {

V09STERHT1YgU0FZUwrigJxkb24ndCB3YXN0ZSB0aW1lIHdpdGggdW5wcm9kdWN0aXZlIHRob3Vn

aHRzCiB5b3VyIHdhZ2VzIHdpbGwgYmUgZG9ja2VkIGFjY29yZGluZ2x54oCdCg==

}
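the two decoding rules can be sketched in python as follows. this is a minimal illustration, with hypothetical function names. stripping whitespace before base64 decoding is an assumption, as is raising an error on an odd number of hex digits (the text above does not say what should happen then).

```python
import base64
import string

HEX_DIGITS = set(string.hexdigits)  # 0-9, a-f, A-F

def decode_hex(body):
    """Ignore everything but ASCII hex digits, then read the digits in pairs."""
    digits = "".join(ch for ch in body if ch in HEX_DIGITS)
    # bytes.fromhex raises ValueError if a digit is left unpaired
    return bytes.fromhex(digits)

def decode_base64(body):
    """Join the presented lines (assumption) and decode them as base64."""
    return base64.b64decode("".join(body.split()))
```

because only hex digits are kept, the spacing and line breaks in a ;hex body are purely cosmetic.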

inline resources can also offer a cleaner syntax for complex multiline macros.

@def text/x.cortav {

* [*[#1]] [!([#2])]

*: [#3]

}

&def nuclear bunker|n|that which will not protect you from the Smiling Man

to make this usage simpler, resources with a type of text/x.cortav can omit the MIME type field.

inline resources are a great way to extend cortav with implementation-dependent features. say you want mathtex in your cortav renderer — all you have to do is support a new MIME type text/x.mathtex, and then the users can embed their math equations like so:

and as we see from the value of κ below, Bose-Fleischer-Kincaid entities of Carlyle subtype γ do not interact at all with the putative "Higgs field" of Athabaskan Windchime Theory, seemingly ruling out any distortion of the spacetime metric, and consequently removing the maximal density parameter that is defined for bosonic matter.

@ text/x.mathtex {>

%% divide subtract differentiate blah blah blah i don't know math

given the selective cross-interaction of γ-BFKs, we conclude that, under the prevailing cosmocelestial paradigm, the answer to the age-old question of how many angels can dance on the head of a pin is [^assump "as many as would like to"]

assump: assuming a perfectly spherical angel in a vacuum

supported parameters

src (all): specifies where to find the file, what it is, and how to embed it. each line of src should consist of two whitespace-separated words: MIME type and URI. the specification can also be prefixed with an extra word, auto, link, or embed, to control how the resource will be referenced from the output file.
- reference mode: the optional first word; if the requested reference mode is not applicable or valid for the output format or URI given, the source line will be skipped over.
  - embed: loads the resource at build time and embeds it into the output file. not all implementations may allow loading remote network resources at build time.
  - link: only embeds a reference to the location of the resource. use this for e.g. live iframes, dynamic images, or images hosted by a CDN.
  - auto: embeds the resource in file formats where that’s practical, and uses a remote reference otherwise. auto is the default if the first word is omitted.
- MIME types: which file types are supported depends on the individual implementation and renderer backend; additionally, extensions can add support for extra types. MIME-types that have no available handler will, where possible, result in an attachment that can be extracted by the user, usually by clicking on a link. however, the following should be usable with all compliant implementations:
  - image/* (graphical outputs only)
  - video/* (interactive outputs only)
  - image/svg+xml is handled specially for HTML files, and may or may not be compatible with other renderer backends.
  - font/* can be used with the HTML backend to reference a web font
  - font/woff2 can be used with the HTML backend to reference a web font
  - text/plain (will be inserted as a preformatted text block)
  - text/css (can be used when producing HTML files to link in an extra stylesheet, either by embedding it or referencing it from the header)
  - text/x.cortav (will be parsed and inserted as a formatted text block; context variables can be passed to the file by setting .var properties on the resource, e.g. .recipient-name: Mr. Winthrop)
  - application/x-troff can be used to supply sections of text written in raw groff syntax. these are ignored by other renderers.
  - text/html can be used to supply sections of text written in raw HTML. these are ignored by non-HTML outputs.
  - any MIME-type that matches the type of file being generated by the renderer can be used to include a block of data that will be passed directly to the renderer.
- URI types: additional URI types can be added by extensions or different implementations, but every compliant implementation must support these URIs.
  - http, https/http+tls: accesses resources over HTTP. add a file fallback if possible for the benefit of renderers/viewers that do not have internet access abilities.
  - file: references local files. (the meaning of “local” varies depending on the translation format.) absolute paths should begin file:/; the slash should be omitted for relative paths. note that this doesn’t have quite the same meaning as in HTML: file can (and usually should) be used with HTML outputs to refer to resources that reside on the same server. a cortav URI of file:/etc/passwd will actually result in the link /etc/passwd, not file:///etc/passwd, when converted to HTML. generally, you should only use http when you’re referring to a resource that exists on a different domain. on systems where text and binary files are handled differently, the URIs file+txt: and file+bin: can be used to specify an opening mode.
  - asset: identical to file, except that paths are interpreted relative to the asset base (the parent directory of the source file if not otherwise defined), rather than the current working directory of the cortav translator process.
  - name: a special URI used generally for referencing resources that are already installed on a target system and do not need to be embedded or linked; the name and type are enough for a renderer on another machine to locate the correct resource. this is useful mostly for fonts, where it’s more typical to refer to fonts that are installed on your system rather than providing paths to font files.
  - gemini: accesses resources over the gemini protocol. currently you should really only use this for embed resources unless you’re using the gemtext renderer backend, since nothing but gemini browsers are liable to support this protocol.
  - role: specifies an abstract resource determined by context, e.g. role:backdrop, role:body-font. for use by translators to formats which make provisions for viewer control. a role URI is special in that it is never embedded; it always depends on context — user preferences, environment variables, system stylesheets, what have you — at the time the output file is viewed, rather than the time of the input file being rendered.
desc: supplies a narrative description of the resource, for use as an “alt-text” when the image cannot be loaded and for screenreaders.
detail: supplies extra narrative commentary that is displayed contextually, e.g. when the user hovers her mouse cursor over the embedded object. also used for desc if desc is not supplied.
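to make the shape of a src line concrete, here is a small python sketch that splits one into its reference mode, MIME type, and URI, plus a helper showing the file: to HTML link translation described above. both function names are hypothetical, and the handling of anything beyond the rules stated in this section is an assumption.

```python
REFERENCE_MODES = ("auto", "link", "embed")

def parse_src_line(line):
    """Split a src line into (mode, mime_type, uri); mode defaults to auto."""
    words = line.split(None, 1)
    if len(words) == 2 and words[0] in REFERENCE_MODES:
        mode, remainder = words
    else:
        mode, remainder = "auto", line.strip()
    parts = remainder.split(None, 1)
    if len(parts) != 2:
        raise ValueError("expected a MIME type followed by a URI")
    mime_type, uri = parts
    # the URI keeps any internal spaces, e.g. name:Times New Roman
    return mode, mime_type, uri.strip()

def html_href(uri):
    """Translate a cortav file: URI into the link used in HTML output."""
    if uri.startswith("file:"):
        return uri[len("file:"):]
    return uri
```

note that everything after the MIME type is treated as the URI, so font names containing spaces survive intact.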

note that in certain cases, full MIME types do not need to be used. say you’re defining a font with the name URI: you can’t necessarily know what file type the system fonts on another computer are going to be. in this case, you can just write font instead of font/ttf or font/woff2 or similar. all cortav needs to know in this case is what abstract kind of object you’re referencing. groff fonts (referenced with the dit URI) don’t have a specific MIME type either.

context variables

context variables are provided so that cortav renderers can process templates. certain context variables are provided for by the standard. you can test for the presence of a context variable with the directive %when ctx var. context variables are accessed with the [#name] span.

cortav.file: the name of the file currently being rendered
cortav.path: the absolute path of the file currently being rendered
cortav.time: the current system time in the form 19:02:31
cortav.date: the current system date in the form Friday 15 March 2024
cortav.datetime: the current system date and time represented in the locale or system-standard manner (e.g. Fri Mar 15 19:02:31 2024)
cortav.page: the number of the page currently being rendered
cortav.id: the identifier of the renderer
cortav.hash: the SHA3 hash of the source file being rendered

on systems with environment variables, these may be accessed as context variables by prefixing their name with env. (including the trailing period).

different renderers may provide context in different ways, such as from command line options or a context file. any predefined variables should carry an appropriate prefix to prevent conflation.
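a renderer might resolve context variables along the following lines. this python sketch is only an assumption about one reasonable approach: the function name lookup_context is hypothetical, and returning None for an undefined variable is a choice made for the example, not something the specification mandates.

```python
import os

def lookup_context(name, context, environ=None):
    """Look up a context variable, routing env.* names to the environment."""
    if environ is None:
        environ = os.environ
    prefix = "env."
    if name.startswith(prefix):
        # env.NAME reads the environment variable NAME
        return environ.get(name[len(prefix):])
    # everything else comes from the renderer-supplied context
    return context.get(name)
```

passing the environment in explicitly keeps the lookup easy to test and lets a renderer restrict which variables a document may see.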

fonts

for output backends that support font specification, cortav provides a sophisticated font management system by means of the font stack.

when a document parse begins, the font stack is empty (unless a default font has already been loaded by an intent file).

when the font stack is empty, cortav does not include font specifications in its output, and thus will use whatever the default of the various rendering programs is.

to use fonts, we first have to define the fonts as resources.

%% first, we create a new nonprinting section to namespace the fonts

^fonts

we then define each font as a resource:

@serif

src: auto font name:Alegreya

embed font/ttf file:project-fonts/alegreya.ttf

link font/woff2 file:/assets/font/alegreya.woff2

auto font name:Times New Roman

auto font dit:TR/bold=TRB/italic=TRI/bold,italic=TRBI

@sans

src: link font name:Alegreya Sans

link font name:Open Sans

link font name:sans-serif

here we have defined two font families, fonts.serif and fonts.sans. each contains a list of references to fonts which will be tried in order. for example, this could be translated into the following CSS:

@font-face {

font-family: "fontdef-serif";

src: local("Alegreya"),

url("data:font/ttf;base64,…") format("font/ttf"),

url("/assets/font/alegreya.woff2") format("font/woff2"),

local("Times New Roman");

}

@font-face {

font-family: "fontdef-sans";

src: local("Alegreya Sans"),

local("Open Sans"),

local("sans-serif");

}

there are two things that aren’t super clear from the CSS, however. notice how we used auto on a couple of those specs? this means it’s up to the renderer to decide whether to link or embed the font. in HTML, a font specified by name can’t really be embedded, but for some translation formats such as PDF or PostScript, a system font can be selected by name and then embedded into the output. auto lets us produce valid HTML while still taking advantage of font embedding in other formats.

now that we have our font families defined, we can use their identifiers with the %font directive to control the font stack. the first thing we need to do is push a new font context, as the stack starts out empty. there are two ways we can do this:

%font dup will create a copy of the current font context, allowing us to make some changes and then revert later with the %font pop command. this isn’t useful in our case, however, because right now the stack is empty; there’s nothing to duplicate.
%font new will create a brand new empty context for us to work with and push it to the stack. this can also be used to temporarily revert to the system default fonts, and then switch back with %font pop.
%font set changes one or more entries in the current font context. it can take a space-separated list of arguments in the form entry=font-id. the supported entries are:
- body: the fallback font. if only this is set in a given font context, it will be used for everything
- paragraph: the font used for normal paragraphs
- header: the font used in headers
- subtitle: the font used in subtitles
- list: the font used in lists
- table: the font used in tables
- caption: the font used for captions
%font pop removes the top context from the font stack.

note that extensions may consult the font context for entries specific to them. for instance, toc checks for toc before falling back to body and then the default font.
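the behavior of the four %font commands can be modeled with a short python sketch. this is an illustrative model only: the class name FontStack and its method names are hypothetical, and raising an error when dup or set is used on an empty stack is an assumption.

```python
class FontStack:
    """A toy model of the font stack: a list of entry -> font-id mappings."""

    def __init__(self):
        self.contexts = []  # empty stack: renderer defaults apply

    def new(self):
        # %font new: push a brand new empty context
        self.contexts.append({})

    def dup(self):
        # %font dup: push a copy of the current context
        if not self.contexts:
            raise IndexError("font stack is empty; nothing to duplicate")
        self.contexts.append(dict(self.contexts[-1]))

    def set(self, **entries):
        # %font set entry=font-id ...: change entries in the current context
        if not self.contexts:
            raise IndexError("font stack is empty; push a context first")
        self.contexts[-1].update(entries)

    def pop(self):
        # %font pop: remove the top context
        self.contexts.pop()

    def lookup(self, entry):
        # fall back to body, then to None (meaning the default font)
        if not self.contexts:
            return None
        top = self.contexts[-1]
        return top.get(entry, top.get("body"))
```

in this model, after new and set(body="sans", header="serif"), a dup followed by set(header="title") changes only the header font, and a pop restores serif headers.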

these commands are enough to give us a very flexible setup. consider the following:

%% let's pretend we've also defined the fonts 'title', 'cursive', and 'thin'

%font new

%font set body=sans header=serif

%font dup

%font set header=title

# WorldGov announcement

%font pop

%% we've now set up a default font context, created a new context for the title of the

%% document, and then popped it back off after the title was inserted so that our

%% first font context is active again. everything after that last '%font pop' will

%% be printed in sans, except for headers, which will be printed in 'serif'

WorldGov would like to congratulate 2274's Employee of the Year, [*The Smiling Man]! The Smiling Man had a few words of encouragement for the weary proles of the world when he graciously accepted his award at this year's ceremonial bloodletting:

%font dup

%font set body=cursive

> It is very important for you to understand that your dreams are the intellectual property of the WorldGov organization.

> Laborers who fail more than one duplicity check per workcycle will receive extra Pit Time.

%font pop

%% above we created a blockquote whose text is printed in a cursive font; afterwards,

%% we simply remove this new context, and everything is back the way it was at "WorldGov would like"

In addition to his 227th consecutive Employee of the Year Award, The Smiling Man has been nominated for a WorldGov Lifetime Achievement Award by the Hyperion Entity in recognition of his exceptional leadership in the Department Which Has No Name. Chief Ritual Officer Mr. Winthrop had this to say:

%% the font mechanism is at its most powerful when used with multiline macros:

cursive-quote: %font dup

%font set body=cursive

> [#1]

%font pop

%% now, whenever we want a block with a cursive body, we can simply invoke

$cursive-quote A sea of blood yet lies between us and the Destination. It won't impede me. And I'm so very proud to say that, apparently, it won't impede the Smiling Man either, if the Svalbard contract was any indication! [pause for laughter]

%% without affecting the overall font context. in fact, since 'cursive-quote' creates

%% its context using 'dup', it would import all font specifications besides 'body'

%% from the environment it is invoked in

you may have noticed the rather odd bit at the end of our font definition, with the dit URI. the reasons for this are tragic. groff, while delightful, has a thoroughly antiquated understanding of fonts, and doesn’t support normal font formats like truetype. groff ships with a limited number of fonts in its own format, identified by obscurantist letter code (HBI is “Helvetica Bold Italic”, for instance) and lacking normal metadata. for this reason, you’ll have to tell cortav how you want your fonts translated.

it is possible to use modern fonts with groff, but to do that you’ll have to convert and install them, which is outside the scope of this document. however, even if you do this, you should specify a fallback font (if possible) so that people rendering your document on other machines still get somewhat sensible output.

the syntax of a dit specification is dit:regular, where regular specifies the name of the regular font. this can be followed by any number of variant specifications /variant=name, where variant is one of the tags described in the custom style section, and name is the name of a DIT font. so the URI in the example names a font TR with bold TRB, italic TRI, and bold-italic TRBI.

the groff backend does do a little magic to make this mess more bearable, however. some of groff’s built-in fonts can be accessed by a name URI instead of having to construct them by hand with a dit URI; the backend hardcodes metadata for these fonts so that documents can render somewhat intelligibly in groff even if the original author did not make special provisions for this. the groff fonts accessible by name are:

Times New Roman
Helvetica
Courier
Bookman

additionally, as a shortcut, if the regular, bold, italic, and bold-italic variants of a DIT font have the predictable pattern of X, XB, XI, XBI (which many do), you can simply write the URI dit:X and cortav will infer the rest. so the example above could be rewritten as dit:TR to exactly the same effect.
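the dit syntax and its shortcut can be sketched in python like so. the function name parse_dit and the returned dictionary are hypothetical. the sketch also assumes that inference only happens when no variants are written out at all; how a partially specified URI should be treated is not stated above.

```python
def parse_dit(uri):
    """Parse a dit: URI into a mapping of variant tag -> DIT font name."""
    scheme = "dit:"
    if not uri.startswith(scheme):
        raise ValueError("not a dit URI")
    regular, *variants = uri[len(scheme):].split("/")
    fonts = {"regular": regular}
    if not variants:
        # shortcut: infer X, XB, XI, XBI from the regular name
        fonts["bold"] = regular + "B"
        fonts["italic"] = regular + "I"
        fonts["bold,italic"] = regular + "BI"
        return fonts
    for spec in variants:
        # each variant specification has the form variant=name
        variant, _, name = spec.partition("=")
        fonts[variant] = name
    return fonts
```

under these assumptions, dit:H expands to the names H, HB, HI, and HBI, the last of which is the “Helvetica Bold Italic” code mentioned earlier.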

custom styles

sometimes you want to be able to issue more specific formatting instructions than “italic” or “bold”. cortav provides a simple custom style mechanism to allow this. a custom style is simply a reference that binds a name to a sequence of space-separated formatting directives. these directives include:

regular: applies the regular form of the font, overriding any previously set styles
medium: applies a weight of the font between regular and bold, defaulting to regular if one is not available
demibold: applies a weight of the font between regular and bold, defaulting to bold if one is not available
bold: applies the usual “bold” weight of the font
dense: applies the heaviest available weight of the font. usually the same as bold
light: applies the usual “light” weight of the font. most fonts do not have a light weight, so this will be the same as regular.
thin: applies the slimmest available weight of the font. usually the same as light.
underline: underlines the text
strike: strikes the text out
italic: applies a slanted variant of the font
oblique: applies the most slanted variant of the font available. usually the same as italic
font=id: switches to font id for the duration of the span. id must be the ID of a resource defining a font.
ext.prop=word: attaches extra information for use by formatting extensions. ext must be the ID of the extension.

once a custom style is defined, you can make use of it using the [.id styled-text] span notation, where id is the identifier of the reference containing your style. for instance, to define and use a style named important that specifies a dense, underlined variant of font impact and applies the CSS class blink when rendered with the html backend:

this paragraph contains some [.important truly important] information.

important: dense underline font=impact html.class=blink

you should always give your styles semantic names where practicable, instead of simply describing their graphical characteristics. this is good practice in general, but especially because your document will be renderable to different formats with different characteristics, and what makes text look important on a manpage in the terminal may be quite different from how it looks in a webpage or PDF.
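since a custom style is just a space-separated list of directives, splitting one apart is straightforward. the following python sketch shows one way a renderer might do it. the function name parse_style and the returned (flags, options) pair are assumptions made for illustration.

```python
def parse_style(definition):
    """Split a custom style into plain directives and key=value directives."""
    flags = []    # e.g. dense, underline
    options = {}  # e.g. font=impact, html.class=blink
    for word in definition.split():
        key, sep, value = word.partition("=")
        if sep:
            options[key] = value
        else:
            flags.append(word)
    return flags, options
```

applied to the important style above, this yields the directives dense and underline, the font impact, and the extension property html.class set to blink.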

directives

%author encodes document authorship. multiple author directives can be issued to add additional coauthors
%cols specifies the number of columns the next object should be rendered with
%include transcludes another file (but see also resources)
%with imports symbols from another scope:
- %with section imports all symbols in section
- %with section.object imports object from section
- %with name=section creates a local alias name for section
- %with name=section.object imports object from section under the name name
%global exports all symbols in the current section so they can be used unprefixed from any other section
- %global section exports all symbols in section
- %global section.object exports object from section
- %global name=section creates a global alias name for section
- %global name=section.object exports object from section under the name name
%quote transcludes another file, without expanding the text except for paragraphs
%expand causes the next object (usually a code block) to be fully expanded when it would otherwise not be
%font controls the font stack, for outputs that support changing fonts. see fonts for more information.
%lang changes the current language, which is used by extensions to e.g. control typographical conventions, and may be encoded into the output by certain renderers (e.g. HTML). note that quotes and blockquotes can be set to a separate language with a simpler syntax. the language should be notated using IETF language tags.
- %lang is x-ranuir-Cent-CR8 sets the current language to Ranuir as spoken in the Central Worlds, written in Corran and encoded using C6B+U8L (which can also be interpreted as UTF-8, albeit with some lost semantics). this might be used at the top of a document to set its primary language.
- %lang push gsw-u-sd-chzh temporarily switches to Zürich German, e.g. to quote a German passage in an otherwise Ranuir document
- %lang sec en-US switches to American English for the duration of a section. does not affect the language stack.
- %lang pop drops the current language off the language stack, returning to whatever was pushed or set before it. this would be used, for instance, at the end of a passage
%pragma supplies semantic data about author intent, the kind of information the document contains and hints about how it should be displayed to the user. think of them like offhand remarks to the renderer — there’s no guarantee that it’ll pay any attention, but if it does, your document will look better. pragmata have no scope; they affect the entire document. the pragma function exists primarily as a means to allow parameters that would normally need to be specified on e.g. the command line to be encoded in the document instead in a way that multiple implementations can understand. a few standard pragmata are defined.
- %pragma layout gives a hint on how the document should be laid out. the first hint that is understood will be applied; all others will be discarded. standard hints include:
  - essay
  - narrative
  - screenplay: uses asides to denote actions, quotes for dialogue
  - stageplay: uses asides to denote actions, quotes for dialogue
  - manual
  - glossary
  - news
  - book: section depths 1-3 gain additional semantics
    1. part: the section gets a page to itself to announce the beginning of a new part or appendix. the first part is treated as the title page.
    2. chapter: the section is preceded by a page break
    3. heading: the section can occur on the same page as text and headings from other sections
- %pragma accent specifies an accent hue (in degrees around the color wheel) for renderers which support colorized output
- %pragma accent-spread is a factor that controls the “spread” of hues used in the document. if 0, only the accent color will be used; if larger, other hues will be used in addition to the primary accent color.
- %pragma dark-on-light on|off controls whether the color scheme used should be light-on-dark or dark-on-light
- %pragma page-width indicates how wide the pages should be
- %pragma title-page specifies a section to use as a title page, for renderer backends that support pagination

note on pragmata: particularly when working with collections of documents, you should not keep shared formatting metadata duplicated across the documents themselves! the best thing to do is to have a makefile for compiling the documents using whatever tools you want to support, and to encode the rendering options in this file so they can all be changed in one place. (for the reference implementation this currently means command line arguments, but eventually it will support intent files as well.) pragmata should instead be used for per-document overrides of default settings.

a workaround for the lack of intent files in the reference implementation is to have a single pseudo-stylesheet that contains only %pragma statements, and then import this file from each individual source file using the %include directive. this is suboptimal and recommended only when you need to ensure compatibility between different implementations.

when creating HTML files, an even better alternative may be to turn off style generation entirely and link in an external, hand-written CSS stylesheet. this is generally the way you should compile sources for existing websites if you aren't going to write your own extension.

examples

blockquotes

the following excerpts of text were recovered from a partially erased hard drive found in the Hawthorne manor in the weeks after the Incident. context is unknown.

> —spoke to the man under the bridge again, the one who likes to bite the heads off the fish, and he suggested i take a brief sabbatical and journey to the Wandering Oak (where all paths meet) in search of inspiration and the forsaken sword of Pirate Queen Granuaile. a capital idea! i shall depart upon the morrow, having honored the Lord Odin and poisoned my accursed minstrels as is tradition—

> —can't smell my soul anymore, but that's beside the point entirely—

> —that second moon (always have wondered why nobody else seems to notice the damn fool thing except on Michaelmas day). alas, my luck did not endure, and i was soon to find myself knee-deep in—

> —just have to see about that, won't we!—

the nearest surviving relative of Lord Hawthorne is believed to be a wandering beggar with a small slow loris for a pet who sells cursed wooden trinkets to unwary children. she will not be contacted, as the officers of the Yard fear her.

links & notes

this sentence contains a [>zombo link] to zombo com. you can do anything[^any] at zombo com.

zombo: https://zombo.com

any: anything you want

macros

the ranuir word {gloss cor|writing}…

gloss: [*[#1]] “[#2]”

$def sur|n|socialism

$def par|n|speech

def: * [*[#1]] [!([#2])]

** [#3]

%% equivalent to

@def {

* [*[#1]] [!([#2])]

** [#3]

}

$def sur|n|socialism

$def par|n|speech

%% we could even do the same thing abusing context variables

@def {

* [*[#word]] [!([#pos])]

** [#meaning]

}

$def

.word: sur

.pos: n

.meaning: socialism

$def

.word: par

.pos: n

.meaning: speech

%% context variables are useful because they inherit from the enclosing context

%% thus, we can exploit resource syntax to create templates with default values

@agent {

+ CODENAME :| [#1]

+ CIVILIAN IDENTITY :| [#civil]

+ RULES of ENGAGEMENT :| [#roe]

+ DANGER LEVEL :| [#danger]

}

.civil: (unknown)

.roe: Monitor; do not engage

.danger: (unknown)

$agent ZUCCHINI PARABLE

.civil: Zephram "Rolodex" Goldberg

.danger: Category Scarlet

$agent RHADAMANTH EXCISE

.roe: Eliminate with extreme prejudice; CBRN deployment authorized

.danger: [*Unquantifiable]
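the positional part of macro expansion, as used by gloss and def above, can be sketched in python. this is a simplified illustration, not the reference implementation: it handles only numbered [#n] spans with |-separated arguments, ignores named context variables, and substitutes an empty string for missing arguments as an assumption. the function name expand_macro is hypothetical.

```python
import re

def expand_macro(body, argument_text):
    """Replace each [#n] span in body with the nth |-separated argument."""
    args = argument_text.split("|")

    def substitute(match):
        index = int(match.group(1))
        # arguments are numbered from 1; missing ones expand to nothing
        if 1 <= index <= len(args):
            return args[index - 1]
        return ""

    return re.sub(r"\[#(\d+)\]", substitute, body)
```

the result is still cortav source, so the surrounding spans such as [*…] are left for the normal span parser to style.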

tables

here is a glossary table.

+ english :+ ranuir + zia ţai + thalishte +

and now the other way around!

+:english :| honor |

+:ranuir :| tef |

+:zia ţai :| pang |

+:thalishte:| mbecheve |

extensions

the cortav specification also specifies a number of extensions that do not have to be supported for a renderer to be compliant. the extension mechanism supports the following directives.

inhibits: prevents an extension from being used even where available
uses: turns on an extension that is not specified by the user operating the renderer (e.g. on the command line)
needs: causes rendering to fail with an error if the extensions are not available

where possible, the directive when has-ext x y z should be used instead of needs x y z. this causes the next section to be rendered only if the named extensions are available. unless has-ext x y z can be used to provide an alternative format.

extensions are mainly interacted with through directives. all extension directives must be prefixed with the name of the extension.

the reference implementation seeks to support all standardized extensions. it’s not quite there yet, however.

toc

sections that have a title will be included in the table of contents. the table of contents is by default inserted at the break between the first level-1 section and the section immediately following it. you may instead place the directive %toc where you wish the TOC to be inserted, or suppress it entirely with inhibits toc. note that some renderers may not display the TOC as part of the document itself.

toc provides the directives:

%toc: insert a table of contents in the specified position. this can be used more than once, but doing so may have confusing, incorrect, or nonsensical results under some renderers, and some may just ignore the directive entirely
%toc mark styled-text: inserts a TOC entry with the label styled-text pointing to the current location. this can be used to e.g. mark noteworthy images, instances of long quotes or literal blocks, or functions inside an expanded code block.
%toc name id styled-text: like %toc mark but allows an additional id parameter which specifies the ID the renderer will assign to an anchor element. this is not meaningful for all renderers and when it is, it is up to the renderer to decide what it means.
- the html render backend interprets id as the id attribute of the anchor tag
- the groff render backend ignores id
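
the default behavior described above (titled sections only, inserted after the first level-1 section) can be sketched as follows. this is a python illustration with a made-up section representation; the dict keys and both function names are hypothetical, and returning position 0 when there is no level-1 section is an assumption the spec text does not cover.

```python
def build_toc(sections):
    """collect (level, title, id) for every section that has a title; anonymous sections are skipped"""
    return [(s['level'], s['title'], s.get('id')) for s in sections if s.get('title')]

def default_toc_position(sections):
    """index at which the TOC is inserted: directly after the first level-1 section"""
    for index, section in enumerate(sections):
        if section['level'] == 1:
            return index + 1
    return 0  # assumption: no level-1 section, so the TOC goes first

sections = [
    {'level': 1, 'title': 'cortav specification', 'id': 'spec'},
    {'level': 2, 'title': None, 'id': 'anon'},  # e.g. created with "##anon"
    {'level': 2, 'title': 'extensions'},
]
toc = build_toc(sections)
position = default_toc_position(sections)
```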

transmogrify

a cortav renderer may automatically translate punctuation marks or symbol sequences to superior representations depending on their context. a compliant implementation of this extension should provide, at minimum:

smart quotes (with consideration for the typographical conventions of languages like German or Spanish)
- %transmogrify language lang can be used to explicitly set the language; otherwise, it must be determined from the value of %pragma lang. if this is not present, implementations may fall back on their own methods for determining the language in use, such as command-line flags.
multigraph to glyph conversion, including at least:
- -- → “—”
- --> → “→”
- <-- → “←”

an escape character before any of the sequence characters should prevent the sequence from being rendered. raw nodes (that is, [\…] and ["…]) should not be scanned for transmogrification, nor should the contents of code blocks unless marked with the %expand directive.

transmogrification shall only take place after all other parsing steps are completed.
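
the multigraph half of this extension is small enough to sketch in python. the version below is an illustration only: it assumes the escape character is a backslash, it consumes the backslash itself (a real renderer may already have dealt with escapes at an earlier stage), and it does not attempt smart quotes or the exclusion of raw nodes and code blocks.

```python
import re

# longest sequences first, so that "-->" is never read as "--" followed by ">"
MULTIGRAPHS = [('-->', '→'), ('<--', '←'), ('--', '—')]
GLYPHS = dict(MULTIGRAPHS)

# alternative 1: a backslash plus the character it escapes; alternatives 2+: the multigraphs
PATTERN = re.compile(
    r'\\(.)|' + '|'.join(re.escape(seq) for seq, _ in MULTIGRAPHS),
    re.DOTALL,
)

def transmogrify(text):
    """replace multigraphs with glyphs, leaving escaped sequences alone"""
    def sub(match):
        if match.group(1) is not None:
            # escaped character: emit it literally so it cannot start a sequence
            return match.group(1)
        return GLYPHS[match.group(0)]
    return PATTERN.sub(sub, text)

plain = transmogrify('a --> b <-- c -- d')
escaped = transmogrify(r'a \--> b')
```

handling the escape inside the same pattern as the multigraphs keeps the scan to a single left-to-right pass, which is why the escaped hyphen can never be picked up as the start of a later match.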

hilite

code can be highlighted according to the formal language it is written in. a compliant hilite implementation must implement basic keyword, symbol, comment, pragma, and literal highlighting for the following formal languages.

C⁴
Bourne Shell
DOS INI
GLSL
gmake and bmake
Lua
Fennel
Terra
Scheme
SQL (including PL/SQL)
libconfig
groff
HTML
CSS
cortav

the highlighter should make use of semantic HTML tags like <var> where possible.

lua

renderers with a lua interpreter available can evaluate lua code:

%lua use file: evaluates file and makes its definitions available
[%lua raw script]: evaluates script and emits the string it returns (if any) in raw span context.
[%lua exp script]: evaluates script and emits the string it returns (if any) in expanded span context.
%lua raw script: evaluates script and emits the string array it returns (if any) in raw block context.
%lua exp script: evaluates script and emits the string array it returns (if any) in expanded block context.

the interpreter should provide a cortav table with the objects:

ctx: contains context variables

used files should return a table with the following members

macros: an array of functions that return strings or arrays of strings when invoked. these will be injected into the global macro namespace.

ts

the ts extension allows documents to be marked up for basic classification constraints and automatically redacted. if you are seriously relying on ts for confidentiality, make damn sure you start the file with %!!needs ts, so that rendering will fail with an error if the extension isn’t supported.

ts currently has no support for misinformation.

ts enables the directives:

%ts class scope level (styled-text): indicates a classification level for either the whole document (scope doc) or the next section (scope sec). if the ts level is below level, the section will be redacted or rendering will fail with an error, as appropriate. if styled-text is included, this will be treated as the name of the classification level.
%ts word scope word (styled-text): indicates a codeword clearance that must be present for the text to render. if styled-text is present, this will be used to render the name of the codeword instead of word.
%when ts level level
%when ts word word

ts enables the spans:

[🔒#level styled-text]: redacts the span if the security level is below that specified.
[🔒.word styled-text]: redacts the span if the specified codeword clearance is not enabled.

(the padlock emoji is shorthand for %ts.)

ts redacts spans securely; that is, they are simply replaced with an indicator that they have been redacted, without visually leaking the length of the redacted text. redacted sections are simply omitted.
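
to make the idea of length-hiding redaction concrete, here is a python sketch of the two span forms. it is not the ts implementation: the `redact` function, the `[REDACTED]` marker, and the regex are invented for illustration, nested spans are not handled, and levels are assumed to be plain integers.

```python
import re

# fixed marker: every redaction looks the same, so the length of the hidden text does not leak
REDACTED = '[REDACTED]'

# matches [🔒#level text] and [🔒.word text], without nested brackets in the text
SPAN = re.compile(r'\[🔒([#.])(\S+) ([^\[\]]*)\]')

def redact(text, level=0, codewords=frozenset()):
    """show a span only if the reader's level or codeword clearances allow it"""
    def sub(match):
        kind, key, body = match.groups()
        if kind == '#':
            allowed = level >= int(key)
        else:
            allowed = key in codewords
        return body if allowed else REDACTED
    return SPAN.sub(sub, text)

line = 'retrieved by [🔒#3 automated buoy downlink] from SUPINE WARBLE'
low = redact(line, level=2)
high = redact(line, level=3)
name = redact('Ambassador [🔒.morose-frenzy Hyacinth Autumn-Lotus]')
```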

example

%ts word doc sorrowful-pines SORROWFUL PINES

# intercept R1440 TCT S3

this communication between the ambassador of [*POLITY DOORMAT CRIMSON] "Socialist League world Glory" and an unknown noble of [*POLITY ROSE] "the Empire of a Thousand Suns" was intercepted by [*SYSTEM SUPINE WARBLE].

## involved individuals

* (A) [*DOORMAT CRIMSON] Ambassador [🔒.morose-frenzy Hyacinth Autumn-Lotus] (confidence 1.0)

* (B) [*ROSE] Duchess [!UNKNOWN] (confidence 0.4)

## provenance

this communication was retrieved by [🔒#3 automated buoy downlink] from [*SYSTEM SUPINE WARBLE].

%ts level sec 9 ULTRAVIOLET

##> transcript

<A> we may have a problem

<B> Hyacinth, I told you not to contact me without—

<A, shouting> god DAMMIT woman I am trying to SAVE your worthless skin

<B> Hyacinth! your Godforsaken scrambler!

<A> …oh, [!fuck].

(signal lost)

specification license

the text of this specification is made available under the terms of the Creative Commons CC-BY-NC-SA 4.0 license. the binding license text may be found in the cortav source control tree at the following paths:

language	license text location
english	`legal/cc-by-nc-sa.en`
german	`legal/cc-by-nc-sa.de`

should the texts be interpreted to conflict in translation, the most restrictive subset of terms shall apply.

reference implementation

the cortav standard is implemented in cortav.lua, found in this repository. only the way cortav.lua interprets the cortav language is defined as a reference implementation; other behaviors are simply how cortav.lua implements the specification and may be copied, ignored, tweaked, violently assaulted, or used as inspiration by a compliant parser.

the reference implementation can be used both as a lua library and from the command line. cortav.lua contains the parser and renderers, ext/* contain various extensions, sirsem.lua contains utility functions, and cli.lua contains the CLI driver.

lua library

there are various ways to use cortav from a lua script; the simplest, however, is probably to precompile your script with luac and link in the necessary components of the implementation. for instance, say we have the following program:

stdin2html.lua

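local ct = require 'cortav'

local mode = {}

-- parse the document arriving on standard input
local doc = ct.parse(io.stdin, {file = '(stdin)'}, mode)

-- mark the document as being rendered to html
doc.stage = {
	kind = 'render';
	format = 'html';
	mode = mode;
}

-- write the rendered html to standard output
io.stdout:write(ct.render.html(doc, {accent = '320'}))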

and the only extension we need is the table-of-contents extension. our script can be translated into a self-contained lua bytecode blob with the following command

$ luac -s -o stdin2html.lc $cortav_repo/{sirsem,cortav,ext/toc}.lua stdin2html.lua

and can then be operated with the command lua stdin2html.lc, with no further need for the cortav repository files. note that the order of the luac command is important! sirsem.lua must come first, followed by cortav.lua, followed by any extensions. your driver script (i.e. the script with the entry point into the application) should always come last.

building custom tools

generally, most existing file-format conversion tools (cmark, pandoc, and so on) have a crucial limitation: they hardcode specific assumptions like document structure. this means that the files they output are generally not suitable as-is for the users’ purposes, and require further munging, usually by hateful shell or perl scripts. some tools do provide libraries for end users to use as a basis for designing their own tools, but these are often limited, and in any case the user ends up having to write their own (non-standard) driver. it’s no surprise that very few people end up doing this.

cortav.lua’s design lends itself to a more elegant solution. one can of course write their own driver using cortav as a library, but most of the time when you’re compiling document sources, you just want a binary you can run from the command line or a makefile. with cortav.lua, you can extend its capabilities easily while keeping the same driver.

in the cortav spec, extensions are mostly intended to give different implementations the ability to offer extra capabilities, but the reference implementation uses an extension architecture that makes it easy to write and add your own. for each type of new behavior you want to implement, just create a new extension and list it on the make command line:

$ nvim ~/dev/my-cortav-exts/imperial-edict.lua

$ make cortav extens+=$HOME/dev/my-cortav-exts/*.lua

the cortav binary this produces will have all the extra capabilities you personally need, without any need to fork cortav.lua itself or even touch the repository.

there’s no reason cortav.lua shouldn’t be able to load extensions at runtime as well; i just haven’t implemented this behavior yet. it probably would only take a few extra lines of code tho.

i will eventually document the extension API, but for now, look at ext/toc.lua for a simple example of how to register an extension.

command line driver

the cortav.lua command line driver can be run from the repository directory with the command lua ./cli.lua, or by first compiling it into a bytecode form that links in all its dependencies. this is the preferred method for installation, as it produces a self-contained executable which loads more quickly, but running the driver in script form may be desirable for development or debugging.

the repository contains a GNU makefile to automate compilation of the reference implementation on unix-like OSes. simply run $ make cortav or $ gmake cortav from the repository root to produce a self-contained bytecode executable that can be installed anywhere on your filesystem, with no dependencies other than the lua interpreter.

note that the makefile strips debugging symbols to save space, so running cli.lua directly as a script may be helpful if you encounter errors and need stacktraces or other debugging information.

henceforth it will be assumed that you have produced the cortav executable and placed it somewhere in your $PATH; if you are instead running cortav.lua directly as an interpreted script, you’ll need to replace $ cortav with $ lua ./cli.lua in incantations.

when run without commands, cortav.lua will read input from standard input and write to standard output. alternately, a source file can be given as an argument. to write to a specific file instead of the standard output stream, use the -o file flag.

$ cortav readme.ct -o readme.html

# reads from readme.ct, writes to readme.html

$ cortav -o readme.html

# reads from standard input, writes to readme.html

$ cortav readme.ct

# reads from readme.ct, writes to standard output

building

the command line driver is built and installed with a GNU make script. this script accepts the variables shown below with their default values:

prefix	`$HOME/.local`	the path under which the package will be installed
build	`build`	the directory where generated objects will be placed; useful for out-of-tree builds
bin-prefix	`$prefix/bin`	directory to install the executables to
default-format-flags	`-m html:width 35em`	a list of flags that will be passed by the viewer script to `cortav` when generating an HTML file

the following targets are supplied to automate the build:

install builds everything, installs the bytecode-executable and the viewer script to $bin-prefix, and registers the viewer script with XDG
install-bin is like install but installs the binary version instead of the bytecode one
excise deletes everything installed and deregisters the file handlers (note that the same variables must be passed to excise as were passed to install!)
clean deletes build artifacts from the $build directory like it was never there
wipe is equivalent to $ make excise && make clean

if you don’t want to install cortav, you can just run $ make without arguments to build the executable.

there are two different ways of building the driver. one is to generate a bytecode file that can be executed directly as a script. this is the most straightforward method, and requires only lua and luac. however, it has several substantial downsides: because it’s only a bytecode file, it requires the lua interpreter to run — and in some environments, the security characteristics of the lua interpreter may make this undesirable. it also must hardcode the path to the lua interpreter (though admittedly this is easy enough to fix if you copy it to another machine of the same architecture). lua is not an entirely predictable environment, as it is controlled by environment variables and may hypothetically do things like load default libraries or alter paths in ways that disrupt the workings of cortav. finally, because the bytecode file is not a binary executable, it cannot directly be given enhanced capabilities on unix-like systems through filesystem metadata — SUID and caps will be ignored by the kernel. while this is of no importance in ordinary operation, there are niche cases where this could be troublesome.

a potentially superior alternative is to build cortav as a directly executable binary. when you tell make to build the binary version, it first compiles the driver to raw bytecode, then invokes tool/makeshim.lua to create a C source file embedding that bytecode, which is then piped into a C compiler. the huge downside, of course, is that building the cortav driver in this way requires a C compiler. however, the binary that it produces is easier to distribute to other computers — you can even statically link in lua so it can run on systems where lua isn’t installed.

to build the binary version, run $ make build/cortav.bin. if you want the build to link lua statically, you’ll additionally need to supply lua’s library prefix in the variable lua-lib-prefix. some example incantations:

$ make build/cortav.bin lua-lib-prefix=/usr/lib on most Linux distros
$ make build/cortav.bin lua-lib-prefix=/usr/local/lib on FreeBSD
$ make build/cortav.bin lua-lib-prefix=$(nix path-info nixpkgs.lua5_3)/lib on NixOS, or on OSX if you’re using the Nix package manager

alternately, you can build lua yourself and link the static library in place without installing it systemwide, which is useful if you want to build a specialized version of lua to link with (or if the sysop doesn’t want your grubby luser hands all over his precious filesystem). note that if you’re building a self-contained version of cortav to distribute, you may want to slim down the binary by building lua without its parser, as the self-contained version of the driver only needs the bytecode VM part of lua to run.

build variables

there are numerous variables you can use to control the build process.

lua	path to the lua interpreter `cortav` should be built and run with
luac	path to the lua compiler
sh	path to a bourne-compatible shell
extens	list of paths to extensions to enable, defaults to `ext/*.lua`. use `extens+=path` to add additional extensions from out of tree
rendrs	list of paths to renderers to enable, defaults to `render/*.lua`
build	path to the build directory, defaults to `build`. change this for out-of-tree builds
executable	name of the executable to be generated, defaults to `cortav`
default-format-flags	specifies command line options that the viewer script should pass to `cortav`
prefix	where files should be installed, defaults to `$HOME/.local`
bin-prefix	where executables should be installed, defaults to `$(prefix)/bin`
debug	if set, builds executables with debugging symbols; if absent, executables are stripped
encoding-data	if set, embeds character class data for supported multibyte encodings into the program. on by default; `$ make encoding-data=` to unset
encoding-data-ucs	path to the UnicodeData.txt file for UCS-based encodings like UTF-8. by default it is automatically downloaded with `curl`
encoding-data-ucs-url	where to download UnicodeData.txt from, if encoding-data-ucs is not changed. defaults to the unicode consortium website

deterministic builds

some operating systems, like NixOS, require packages that can be built in reproducible ways. this implies that all data, all state that goes into producing a package needs to be accounted for before the build proper begins. the cortav build process needs to be slightly altered to support such a build process.

while the cortav specification itself does not concern itself with matters like whether a particular character is a numeral or a letter, optimal typesetting in some cases requires such information. this is the case for the equation span- and block-types, which need to be able to distinguish between literals, variables, and mathematical symbols in the equations they format⁵. while the ASCII charset is small enough that exhaustive character class information can be manually hardcoded into a cortav implementation, the various encodings of Unicode most certainly are not.

for this reason, the reference implementation of cortav embeds the file UnicodeData.txt, a database maintained by the Unicode Consortium. this is a rather large file that updates for each new Unicode version, so it is downloaded as part of the build process. to build on NixOS, you’ll need to either disable the features that rely on this database (not recommended), or download the database yourself and tell the build script where to find it. this is the approach the official nix expression will take when i can be bothered to write it. see the examples below for how to conduct a deterministic build.

deterministic build with unicode database

/src $ mkdir cortav && cd cortav

/src/cortav $ fossil clone https://c.hale.su/cortav .fossil && fossil open .fossil

/src/cortav $ curl https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt > /tmp/unicode.txt

/src/cortav $ make build/cortav encoding-data-ucs=/tmp/unicode.txt

deterministic build without unicode database

/src $ mkdir cortav && cd cortav

/src/cortav $ fossil clone https://c.hale.su/cortav .fossil && fossil open .fossil

/src/cortav $ make build/cortav encoding-data=

do note that no cortav implementation needs to concern itself with character class data. this functionality is provided in the reference implementation strictly as an (optional) extension to the spec to improve usability, not as a normative requirement.
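
for readers curious what character class data looks like in practice, the python sketch below reads a few records in the UnicodeData.txt format (semicolon-separated fields, with the code point in hex first and the general category third). it is not how the reference implementation consumes the file: the function names are hypothetical, the mapping from general category to literal, variable, and symbol is an assumption for illustration, and the range entries the real file uses for large blocks are ignored.

```python
# three records in the UnicodeData.txt format; only fields 0 and 2 are used here
SAMPLE = """\
002B;PLUS SIGN;Sm;0;ES;;;;;N;;;;;
0031;DIGIT ONE;Nd;0;EN;;1;1;1;N;;;;;
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
"""

def load_categories(lines):
    """map each code point to its general category (e.g. 'Nd', 'Ll', 'Sm')"""
    categories = {}
    for line in lines:
        fields = line.strip().split(';')
        if len(fields) < 3:
            continue  # blank or malformed line
        categories[int(fields[0], 16)] = fields[2]
    return categories

def classify(char, categories):
    """assumed mapping: decimal digits are literals, letters are variables, Sm is a math symbol"""
    category = categories.get(ord(char), '')
    if category == 'Nd':
        return 'literal'
    if category.startswith('L'):
        return 'variable'
    if category == 'Sm':
        return 'symbol'
    return 'other'

categories = load_categories(SAMPLE.splitlines())
kinds = [classify(c, categories) for c in 'a+1']
```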

switches

cortav.lua offers various switches to control its behavior.

long	short	function
`--out file`	`-o`	sets the output file (default stdout)
`--log file`	`-l`	sets the log file (default stderr)
`--define var val`	`-d`	sets the context variable `var` to `val`
`--mode-set mode`	`-y`	activates the mode with ID `mode`
`--mode-clear mode`	`-n`	disables the mode with ID `mode`
`--mode id val`	`-m`	configures mode `id` with the value `val`
`--mode-set-weak mode`	`-Y`	activates the mode with ID `mode` if the source file does not specify otherwise
`--mode-clear-weak mode`	`-N`	disables the mode with ID `mode` if the source file does not specify otherwise
`--mode-weak id val`	`-M`	configures mode `id` with the value `val` if the source file does not specify otherwise
`--help`	`-h`	display online help
`--version`	`-V`	display the interpreter version
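
the relationship between the ordinary and the weak mode switches amounts to a precedence order, which the python sketch below spells out. the function is hypothetical, and one part of it is an assumption: the table only says that weak settings yield to the source file, so the idea that the ordinary switches override the source file is inferred rather than stated.

```python
def resolve_modes(weak, source, strong):
    """merge mode settings, lowest priority first"""
    modes = dict(weak)    # -Y / -N / -M: used only if nothing else sets the mode
    modes.update(source)  # settings made by the source file itself
    modes.update(strong)  # -y / -n / -m: assumed to win over the source file
    return modes

modes = resolve_modes(
    weak={'html:accent': '80', 'html:dark-on-light': True},
    source={'html:accent': '320'},
    strong={'html:width': '40em'},
)
```

here the weak accent of 80 is discarded because the document asks for 320, while the weak dark-on-light setting survives because the document is silent about it.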

modes

most of cortav.lua’s implementation-specific behavior is controlled by use of modes. these are namespaced options which may have a boolean, string, or numeric value. boolean modes are set with the -y and -n flags; other modes use the -m flags.

most modes are defined by the renderer backend. the following modes affect the behavior of the frontend:

ID	type	effect
`render:format`	string	selects the renderer (default `html`)
`parse:show-tree`	flag	dumps the parse tree to the log after parsing completes

renderers

cortav.lua implements a frontend-backend architecture, separating the parsing stage from the rendering stage. this means new renderers can be added to cortav.lua relatively easily. currently, only an HTML renderer is included; however, a groff backend is planned at some point in the future, so that PDFs and manpages can be generated from cortav files.

html

the HTML renderer is activated with the incantation -m render:format html. it is currently the default backend. it produces a single HTML file, optionally with CSS styling data, from a .ct input file.

modes

html supports the following modes:

string (css length) html:width sets a maximum width for the body content in order to make the page more readable on large displays
number html:accent applies an accent hue to the generated webpage. the hue is specified in degrees, e.g. -m html:accent 0 applies a red accent.
flag html:dark-on-light uses dark-on-light styling, instead of the default light-on-dark
flag html:fossil-uv outputs an HTML snippet suitable for use with the Fossil VCS webserver. this is intended to be used with the unversioned content mechanism to host rendered versions of documentation written in cortav that’s stored in a Fossil repository.
flag html:xhtml generates syntactically-‘valid’ XHTML5
flag html:epub generates XHTML5 suitable for use in an EPUB3 archive
number html:hue-spread generates a color palette based on the supplied accent hue. the larger the value, the more the other colors diverge from the accent hue.
string html:link-css generates a document linking to the named stylesheet
flag html:gen-styles embeds appropriate CSS styles in the document (default on)
flag html:snippet produces a snippet of html instead of an entire web page. note that proper CSS scoping is not yet implemented (and can’t be implemented hygienically since scoped was removed 😢)
string html:title specifies the webpage titlebar contents (normally autodetected from the document based on headings or directives)
string html:font specifies the default font to use when rendering as a CSS font specification (e.g. -m html:font 'Alegreya, Junicode, Georgia, "Times New Roman"')

$ cortav readme.ct --out readme.html \

-m render:format html \

-m html:width 40em \

-m html:accent 80 \

-m html:hue-spread 35 \

-y html:dark-on-light # could also be written as:

$ cortav readme.ct -ommmmy readme.html render:format html html:width 40em html:accent 80 html:hue-spread 35 html:dark-on-light
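
the bundled form works because each short switch takes a fixed number of parameters, so the parameters can be handed out in the order the switches appear. the python sketch below shows that idea; it is not the driver's actual parser, the names are hypothetical, and the arity table is read off the switches table rather than taken from the source code.

```python
# number of parameters each short switch consumes, per the switches table
ARITY = {'o': 1, 'l': 1, 'd': 2, 'y': 1, 'n': 1, 'm': 2,
         'Y': 1, 'N': 1, 'M': 2, 'h': 0, 'V': 0}

def take(args, flag):
    """pull the next parameter for a switch, failing loudly if none is left"""
    try:
        return next(args)
    except StopIteration:
        raise ValueError(f'switch -{flag} is missing a parameter') from None

def parse_args(argv):
    """split argv into (switch, parameters) pairs and positional arguments"""
    options, positional = [], []
    args = iter(argv)
    for arg in args:
        if arg.startswith('-') and not arg.startswith('--') and len(arg) > 1:
            # a bundle such as -ommmmy: each letter claims its parameters in turn
            for flag in arg[1:]:
                options.append((flag, [take(args, flag) for _ in range(ARITY[flag])]))
        else:
            positional.append(arg)
    return options, positional

options, positional = parse_args(
    'readme.ct -ommmmy readme.html render:format html html:width 40em '
    'html:accent 80 html:hue-spread 35 html:dark-on-light'.split()
)
```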

directives

html supplies the following render directives.

%html link rel mime href: inserts a <link> tag in the header, for example, to link in an alternate stylesheet, or help feed readers find your atom or rss feed.
- %html link alternate\ stylesheet text/css /res/style2.css
- %html link alternate application/atom+xml /feed.atom
%html style id: adds the stylesheet referenced by id into the document stylesheet. the stylesheet is specified using a resource.

stylesheets

the html backend offers some additional directives for external CSS files that are embedded into the document, in order to simplify integration with the accent mechanism. these are:

@fg: resolves to a color expression denoting the selected foreground color. equivalent to tone(1)
@bg: resolves to a color expression denoting the selected background color. equivalent to tone(0)
@tone[/alpha](fac [shift [saturate]] ): resolves to a color expression. fac is a floating-point value scaling from the background color to the foreground color. shift is a value in degrees controlling how far the hue will shift relative to the accent. saturate is a floating-point value controlling how saturated the color is.

groff

the groff backend produces a text file suitable for supplying to a groff compiler. groff is the GNU implementation of a venerable typesetting system from the early days of UNIX.

you can produce a final output directly by piping from the cortav driver into groff. if your document uses an encoding other than ASCII, you’ll need to notify groff of this with the -K flag. for example, to render a UTF8 cortav file to PDF:

$ cortav input.ct -m render:format groff | groff -Tpdf -Kutf8 > output.pdf

in the future, it is planned to enable the driver to operate groff automatically and directly produce the desired output format when the binary wrapper is in use. doing so securely and hygienically is not possible in pure lua, however.

modes

groff supports the following modes:

string groff:annotate controls how footnotes will be handled.
- footnote places footnotes at the end of the page they are referenced on. if the same footnote is used on multiple pages, it will be duplicated on each.
- secnote places footnotes at the end of each section. footnotes used in multiple sections will be duplicated for each
- endnote places all footnotes at the end of the rendered document.
string groff:title-page takes an identifier that names a section. this section will be treated as the title page for the document.
string groff:title sets a specific title to be used in headers instead of relying on header heuristics

directives

%pragma title-page id sets the title page to section id. this causes it to be specially formatted, with a large, centered title and subtitle.

quirks

if the toc extension is active but no %toc directive is provided, the table of contents will be given its own section at the start of the document (after the title page, if any).

further directions

additional backends

it is eventually intended to support the following backends, if reasonably practicable.

html: emit HTML and CSS code to typeset the document. in progress
svg: emit SVG, taking advantage of its precise layout features to produce a nicely formatted and paginated document. pagination can perhaps be accomplished through emitting multiple files (somewhat problematic) or by assigning one layer to each page. long term
groff: the most important output backend, rivalling html. will allow the document to be typeset in a wide variety of formats, including PDF and manpage. in progress
gemtext: essentially a downrezzing of cortav to make it readable to Gemini clients
ast: produces a human- and/or machine-readable dump of the document’s syntax tree, to aid in debugging or for interoperation with systems that do not support `cortav` directly. mode ast:repr will allow selecting formats for the dump. ast:rel can be tree (the default) to emit a hierarchical representation, or flat to emit an array of nodes that convey hierarchy by naming one another⁶, rather than being placed inside one another. tree is easier for humans to parse; flat is easier for computers. origin information can be included for each node with the flag ast:debug-syms, but be aware this will greatly increase file size.
- tabtree (default): a hierarchical tree view, with the number of tabs preceding an item showing its depth in the tree
- sexp
- binary: emit a raw binary format that is easier for programs to read. maybe an lmdb or cdb file?
- json: obligatory, alas

some formats may eventually warrant their own renderer, but are not a priority:

text: cortav source files are already plain text, but a certain amount of layout could be done using ascii art.
ansi: emit sequences of ANSI escape codes to lay out a document in a terminal-friendly way
tex: TeX is an unholy abomination and i neither like nor use it, but lots of people do and if cortav ever catches on, a TeX backend should probably be written eventually.

PDF is not on either list because it’s a nightmarish mess of a format and groff, which is installed on most linux systems already, can easily generate PDFs.

LCH support

right now, the use of color in the HTML renderer is very unsatisfactory. the accent mechanism operates on the basis of the CSS HSL function, which is not perceptually uniform; different hues will present different mixes of brightness and some (yellows?) may be ugly or unreadable.
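
the non-uniformity is easy to demonstrate numerically. the python sketch below converts two hues at the same HSL saturation and lightness to sRGB with the standard library's colorsys module and then computes the relative luminance of each using the usual sRGB linearization and Rec. 709 weights. the function name is hypothetical and the snippet has nothing to do with the renderer's own color code; it only illustrates the problem.

```python
import colorsys

def relative_luminance(hue_degrees, saturation=1.0, lightness=0.5):
    """relative luminance (0 = black, 1 = white) of an HSL color"""
    # colorsys takes hue in the range 0..1 and uses the argument order H, L, S
    r, g, b = colorsys.hls_to_rgb(hue_degrees / 360.0, lightness, saturation)

    def linear(channel):
        # undo the sRGB transfer curve
        if channel <= 0.04045:
            return channel / 12.92
        return ((channel + 0.055) / 1.055) ** 2.4

    return 0.2126 * linear(r) + 0.7152 * linear(g) + 0.0722 * linear(b)

yellow = relative_luminance(60)   # hsl(60, 100%, 50%)
blue = relative_luminance(240)    # hsl(240, 100%, 50%)
```

both colors claim a lightness of 50%, yet the yellow comes out more than ten times as luminous as the blue, which is why a single accent-based stylesheet can look fine at one hue and unreadable at another.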

the ideal solution would be to simply switch to using LCH based colors. unfortunately, only Safari actually supports the LCH color function right now, and it’s unlikely (unless Lea Verou and her husband manage to work a miracle) that Colors Level 4 is going to be implemented very widely any time soon.

this leaves us in an awkward position. we can of course do the math ourselves, working in LCH to implement the internal @tone macro, and then “converting” these colors to HSL. unfortunately, you can’t actually convert from LCH to HSL; it’s not like converting from pounds to kilograms. LCH can represent any color the human visual system can perceive; sRGB can’t, and CSS HSL is implemented in sRGB. however, we could at least approximate something that would allow for perceptually uniform brightness, which would be an improvement, and this is probably the direction to go in, unless a miracle occurs and lch() or color() pop up in Blink.

it may be possible to do a more reasonable job of handling colors in the postscript and TeX outputs. unsure about SVG but i assume it suffers the same problems HTML/CSS do. groff lets us choose between rgb and cmyk for specifying color, with no explanation of what those mean.

currently all internal colors are expressed and stored as HSL, and we’re using code borrowed from my Minetest mod sorcery & written by glowpelt for converting that into RGB for non-HTML outputs like groff. probably what should be done is to switch to LCH internally, accept both HSL and LCH input, and “downres” to the best representation each format can support. that’s probably going to have to wait until someone who knows a bit more about color theory and math than me can do it (paging Lea Verou)

intent files

there’s currently no standard way to describe the intent and desired formatting of a document besides placing pragmata in the source file itself. this is extremely suboptimal, as when generating collections of documents, it’s ideal to be able to keep all formatting information in one place. users should also be able to specify their own styling overrides that describe the way they prefer to read cortav files, especially for uses like gemini or gopher integration.

at some point soon cortav needs to address this by adding intent files that can be activated from outside the source file, such as with a command line flag or a configuration file setting. these will probably consist of lines that are interpreted as pragmata. in addition to the standard intent format however, individual implementations should feel free to provide their own ways to provide intent metadata; e.g. the reference implementation, which has a lua interpreter available, should be able to take a lua script that runs after the parse stage and makes arbitrary alterations to the AST. this will be particularly useful for the end-user who wishes to specify a particular format she likes reading her files in without forcing that format on everyone she sends the compiled document to, as it will be able to interrogate the document and make intelligent decisions about what pragmata to apply.

intent files should also be able to define resources, context variables, and macros.

implementation license

the cortav reference implementation is made available under the terms of the GNU Affero General Public License v3. the binding license text may be found in the cortav source control tree at the following paths:

language	license text location
english	`legal/agpl.en`

should the texts be interpreted to conflict in translation, the most restrictive subset of terms shall apply.

trademarks

the name “cortav” is a trademark of alexis hale, and may be used only insofar as the following terms apply:

- the name “cortav” is applied to an implementation of the cortav language that strictly conforms to at least level 1 of this specification
- the name is not used unqualified; i.e. no project may name itself simply “cortav”. below are some examples of permissible names under this term:
1. cortav-scheme
2. cortav.c
3. pycortav

this grant may be revoked at any time, for any reason, by the trademark owner. if you wish to use the name “cortav” in contravention of this grant or simply require stronger legal guarantees, feel free to contact me and we can probably work something out as long as you’re not some corporate asshole.

flat sexp example output

(nodes
  (section (id . "section1")
    (anchor "introduction")
    (kind . "ordinary")
    (label . "section1-heading")
    (nodes
      "section1-heading"
      "para1"
      "para2"
      "hzrule"
      "para3"))
  (section (id . "section2")
    (kind . "ordinary")
    (label . "section2-heading")
    (nodes
      "para4"
      "hzrule"
      "para5"
      "list1"))
  (block list (id . "list1")
    (kind . "ordered")
    (nodes
      "para6"
      "list2"
      "para7"))
  (block list (id . "list2")
    (kind . "unordered")
    (nodes
      "para8"
      "para9"
      "para10"))
  (block para (id . "para1")
    (nodes "text1" "format1" "text3" "footnote1" "text4"))
  (block label (id . "section1-heading") (nodes "section1-heading-text"))
  (text (id . "section1-heading-text") "Contemplating the Anathema")
  (text (id . "text1")
    "Disquieting information has recently been disclosed to virtual journalists of the Giedi Prime infomatrix by sources close to the Hyperion Entity regarding the catastrophic Year of Schisms and the unidentified agents believed to be responsible for memetically engineering the near-collapse of the Church Galactic.")
  (span format (id . "format1")
    (style . "emph")
    (nodes "text2"))
  (text (id . "text2") "Curiously,")
  (text (id . "text3") "his Cyber-Holiness")
  (text (id . "footnote1-caption-text") "Pope Chewbacca III")
  (span footnote (id . "footnote1")
    (note . "footnote1-text")
    (ref . "papal-disclaimer")
    (nodes
      "footnote1-caption-text"))
  (text (id . "text4") "has thus far had little to say on the matter, provoking rampant speculation among the faithful.")
  (footnote-def (id . "footnote1-def")
    (nodes "footnote1-text"))
  (text (id . "footnote1-text") "Currently recognized as legitimate successor to Peter of Terra by 2,756 sects, rejected by 678 of mostly Neo-Lutheran origin, and decried as an antipope by 73, most notably Pope Peter II of Centaurum Secundus, leader of the ongoing relativistic crusade against star systems owned by Microsoft.")
  ;;; snip ;;;
  (document
    (nodes
      "section1" "section2")))

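a consumer of this flat output has to follow the ID references to rebuild the document tree. the sketch below shows one way to do that in python, assuming the s-expressions have already been parsed into a dict keyed by ID. the dict layout, the "type" and "value" keys, and the resolve function are all hypothetical; none of them belong to the reference implementation.

```python
def resolve(nodes, node_id):
    """Expand the node named by node_id, replacing child IDs with the nodes they name.

    hypothetical helper: assumes every ID in a "nodes" list exists in the table
    and that the references contain no cycles.
    """
    node = nodes[node_id]
    # copy every property except the child list
    expanded = {key: val for key, val in node.items() if key != "nodes"}
    if "nodes" in node:
        # recursively swap each referenced ID for its expanded node
        expanded["nodes"] = [resolve(nodes, child) for child in node["nodes"]]
    return expanded

# a tiny table mirroring two entries of the listing above (modelled as dicts, an assumption)
flat = {
    "format1": {"type": "span format", "style": "emph", "nodes": ["text2"]},
    "text2": {"type": "text", "value": "Curiously,"},
}
tree = resolve(flat, "format1")
```

the flat form keeps every node addressable by ID, which is convenient for tools that want to look up or rewrite a single node; resolving into a nested tree is only needed when a consumer wants to walk the document in reading order.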