Index: cortav.ct ================================================================== --- cortav.ct +++ cortav.ct @@ -19,17 +19,17 @@ ** you must put a space between the control sequence (the sequence of hashes or section symbols, in this case) and the title text. [`# title] creates a section with the heading text "title", but [`#title] creates a new section with no heading at all; instead, it gives the anonymous section the ID [`title]. and of course, you can combine the two: [`#ttl title] creates a section with the heading text "title" and the ID [`ttl]. what are IDs for? we'll get to that in a little bit * [*paragraphs] are mostly the same as in markdown, except that a paragraph break occurs after every newline character, not every blank line. paragraphs can be indented by however many spaces you like; such indentation will be ignored. (tabs have a special meaning, however). in cortav, you can also explicitly mark a line of text as a paragraph by preceding it with a period character ([`.]), which is useful if you want to start a paragraph with text that would otherwise be interpreted specially. * [*italic text] -- or rather, [!emphasized] text -- is written as [`\[!my spiffy italic text\]]. in cortav, these spans can be nested within other spans (or titles, or table cells, or…), and the starting and ending point is unambiguous. * [*bold text] -- or rather, [*strong] text -- is written as [`\[*my commanding bold text\]]. * [*bold-italic text] -- or rather, [![*emphasized strong text]] -- has no specific notation. rather, you create it by nesting one span within the other, for instance: [`\[*[!my ostentatious bold-italic text\]]]. -* [*links] are quite different from their markdown equivalents. cortav does not have inline links, as it is intended to be easily readable in both formatted and plain-text format, and long URLs rather disrupt the flow of reading. 
rather, a link tag is written with the notation [`\[>nifty-link my nifty link\]], where the word [`nifty-link] immediately following the arrow is an [!identifier] indicating the destination of the link. (instead of a greater-than sign, you can also use the unicode arrow symbol [`→].) if the identifier is the same as one you've assigned to a section, cortav produces a link within the document to that section. otherwise, it will look for a [!reference] to tell it the URI for the link. a reference is a key-value pair created by adding a line like [`nifty-link: https://zombo.com] [!indented by exactly one tab]. you can place this reference anywhere you like so long as it's in the same section; if you want to name a reference in another section, you have to prefix it with that section's ID, e.g. [`\[>spiffy-section.nifty-link my nifty link declared in a spiffy section\]]. +* [*links] are quite different from their markdown equivalents. cortav does not have inline links, as it is intended to be easily readable in both formatted and plain-text format, and long URLs rather disrupt the flow of reading. rather, a link tag is written with the notation [`\[>nifty-link my nifty link\]], where the word [`nifty-link] immediately following the arrow is an [!identifier] indicating the destination of the link. (instead of a greater-than sign, you can also use the unicode arrow symbol [`→].) if the identifier is the same as one you've assigned to a document object, such as a section, cortav produces a link within the document to that object. otherwise, it will look for a [!reference] (or failing that, a [>rsrc resource]) to tell it the URI for the link. if nothing in the document matches the ID, an error will result and compilation will be aborted. (a reference is a key-value pair created by adding a line like [`nifty-link: https://zombo.com] [!indented by exactly one tab]. 
you can place this reference anywhere you like so long as it's in the same section; if you want to name a reference in another section, you have to prefix it with that section's ID, e.g. [`\[>spiffy-section.nifty-link my nifty link declared in a spiffy section\]].) * [*lists] use a different syntax from markdown. you can start a line with a [`*] to create an unordered list, or [`:] to create an ordered list; indentation doesn't matter. if you want to nest list items, instead of putting two spaces before the child item, you just add another star or colon. and of course, you can nest lists of different kinds within one another. -* [*horizontal rules] use roughly the same syntax: three or more hyphens on a line of their own ([`\---]). underlines also work ([`___], [`-_-], [`__-__-__] etc). -* some markdown implementations support [*tables]. cortav does too, using a very simple notation. +* [*horizontal rules] use roughly the same syntax: three or more hyphens on a line of their own ([`\---]). underlines also work ([`___], [`-_-], [`__-__-__] etc), as do horizontal unicode box drawing characters ([`─ ━ ┈] etc). +* some markdown implementations support [*tables]. cortav does too, using a very simple notation similar to the usual notation used in markdown. a key difference, however, is that cortav table cells can contain any formatting a paragraph can. * [*underlines] are supported by some markdown implementations. in cortav, you can apply them with the notation [`\[_my underlined text\]] -- please just use them sparingly when you render to HTML! * [*strikethrough] is supported by some extended versions of markdown. cortav uses the notation [`\[~my deleted text\]], with the intended semantics of text that is being removed by some revision of a document. (you can also denote text that is being [!added] by using a plus sign instead of a tilde) -* [*images] are a bit more complicated. see the section on [>rsrc resources] for an explanation. 
+* [*images] are a bit more complicated, but much more versatile. see the section on [>rsrc resources] for an explanation. * [*smart quotes] and [*em dashes] are inserted automatically, just as in markdown, provided you have the [>tsmog transmogrify] extension available. (it is part of the reference implementation and defined by the spec, but not required.) in fact, you can insert longer dashes than em dashes just by increasing the number of hyphens. the reference implementation's transmogrifier also translates ascii arrows like [`\-->] into their unicode equivalents ([`→]). * [*literals] (also known as [*code text]) can be inserted with the [`\[`int main(void);] syntax. note however that literals are not protected from the transmogrifier, and are parsed like any other span, which may cause problems if the source code you're quoting makes use of such forbidden runes. in this case, you'll want to wrap the code span in a raw span. the syntax for this is [`\[`[\\int main(void);\]]], but since this is a bit unwieldy it can also be abbreviated as [`\[`\\int main(void);\]]. of course, this is only a small taste of what cortav can do, not even touching on key features like macros, footnotes, or equation formatting. read the sections on [>onblocks blocks] and [>onspans spans] for all the gory details. @@ -48,42 +48,50 @@ * [`ctc] is the shorthand extension * [`cortavcun] is the canonical disambiguation extension * [`] is the canonical Corran extension, a byte sequence comprising the unicode codepoints [`U+E3CE U+E3BD U+E3CE]. where the filesystem in question does not specify a filename encoding, the bytes should be expressed in UTF-8. 
on systems which use metadata to encode filetype, two values are defined to identify cortav source files -* [`text/x-cortav] should be used when strings or arbitrary byte sequences are supported +* [`text/x.cortav] should be used when strings or arbitrary byte sequences are supported * [`CTAV] (that is, the byte sequence [`0x43 54 41 56]) should be used on systems that support only 32-bit file types/4-character type codes like Classic Mac OS. two more values are defined to identify cortav intent files. -* [`text/x-cortav-intent] +* [`text/x.cortav-intent] * [`CTVC] (the byte sequence [`0x43 54 56 43]) -on systems which do not define a canonical way of encoding the filetype but support extended attributes of some kind, such as linux, an attribute named [$mime] may be created and given the value [`text/x-cortav] or [`text/x-cortav-intent]; alternatively, extensions may be used. +on systems which do not define a canonical way of encoding the filetype but support extended attributes of some kind, such as linux, an attribute named [$mime] may be created and given the value [`text/x.cortav] or [`text/x.cortav-intent]; alternatively, extensions may be used. -it is also possible to indicate the nature of a cortav file without using filesystem metadata. this is done by prefixing the file with a magic byte sequence. the sequence used depends on the encoding. +it is also possible to indicate the nature of a cortav file without using filesystem metadata. this is done by prefixing the file with a magic byte sequence. the sequence used depends on the encoding. currently, only sequences for UTF-8 and ASCII are defined, as these are the only encodings supported by the reference implementation. in the event that other implementations add support for other encodings, their sequences will be standardized here. 
* for UTF-8 and ASCII plain text files, [`%ct[!\\n]] (that is, the byte sequence [`0x25 63 74 0A]) should be used -* for C6B+PS files (parastream), the file should begin with the paragraph [`], which equates to the byte sequence [` 0x3E 2E 14 0C 01 04 00 00 00 03 07 3E 2D], including the parastream header). consequently, this sequence should be ignored by a cortav parser at the start of a file (except as an indication of file format). for FreeDesktop-based systems, the [`build/velartrill-cortav.xml] file included in the repository supplies mappings for the extensions and magic byte sequences. a script is also included which can be registered with xdg-open so that double-clicking on a cortav file will render it out and open it in your default web browser. [`$ make install] will generate the necessary FreeDesktop XML files and register them, as well as install the script and the [`cortav] executable itself. for more information see [>refimpl-build building the reference implementation]. +##levels levels +not all of cortav's features make sense in every context. for this reason, cortav defines N [!levels] of compliance. for example, a social media platform that enables simple paragraph styling and linking using cortav syntax may claim to be "cortav level 1 compliant". every level [=N] is a strict superset of level [=N-1]. +* level 1: [*styling]. simple inline formatting sequences like strong, emphatic, literal, links, etc. math equation styling need not be supported. paragraphs, lists, and references are the only block elements supported. suitable for styling tweets and other very short content. +* level 2: [*layout]. implements header, paragraph, newline, directive, and reference block elements. supports resources at least for remote or attached images. suitable for longer social media posts. +* level 3: [*publishing]. implements all currently standardized core behavior, including zero or more extensions. +* level 4: [*reference]. 
implements all currently standardized behavior, including [!all] standardized extensions. + +! note that which translators are implemented is not specified by level, as this is, naturally, implementation-dependent. (it would make rather little sense for the blurb parser of a cortav-enabled blog engine to support generating PDFs, after all.) level encodes only which features of the cortav [!language] are supported. + ##onblocks structure -cortav is based on an HTML-like block model, where a document consists of sections, which are made up of blocks, which may contain a sequence of spans. flows of text are automatically conjoined into spans, and blocks are separated by one or more newlines. this means that, unlike in markdown, a single logical paragraph [*cannot] span multiple ASCII lines. the primary purpose of this was to ensure ease of parsing, but also, both markdown and cortav are supposed to be readable from within a plain text editor. this is the 21st century. every reasonable text editor supports soft word wrap, and if yours doesn't, that's entirely your own damn fault. +cortav is based on an HTML-like block model, where a document consists of sections, which are made up of blocks, which may contain a sequence of spans. flows of text are automatically conjoined into spans, and blocks are separated by one or more newlines. this means that, unlike in markdown, a single logical paragraph [*cannot] span multiple ASCII lines. the primary purpose of this was to ensure ease of parsing, but also, both markdown and cortav are supposed to be readable from within a plain text editor. this is the 21st century. every reasonable text editor supports soft word wrap, and if yours doesn't, that's entirely your own damn fault. hard-wrapping lines is incredibly user-hostile, especially to users on mobile devices with small screens. cortav does not allow it. the first character(s) of every line (the "control sequence") indicates the role of that line. 
if no control sequence is recognized, the line is treated as a paragraph. the currently supported control sequences are listed below. some control sequences have alternate forms, in order to support modern, readable unicode characters as well as plain ascii text. * [*paragraphs] ([`.] [` ¶] [`❡]): a paragraph is a simple block of text. the period control sequence is only necessary if the paragraph text starts with text that would be interpreted as a control sequence otherwise * newlines [` \\]: inserts a line break into previous paragraph and attaches the following text. mostly useful for poetry or lyrics -* [*section starts] [`#] [`§]: starts a new section. all sections have an associated depth, determined by the number of sequence repetitions (e.g. "###" indicates depth-three"). sections may have headers and IDs; both are optional. IDs, if present, are a sequence of raw-text immediately following the hash marks. if the line has one or more space character followed by styled-text, a header will be attached. the character immediately following the hashes can specify a particular type of section. e.g.: +* [*section starts] [`#] [`§]: starts a new section. all sections have an associated depth, determined by the number of sequence repetitions (e.g. "###" indicates depth three). sections may have headers and IDs; both are optional. IDs, if present, are a sequence of raw-text immediately following the hash marks. if the line has one or more space character followed by styled-text, a header will be attached. the character immediately following the hashes can specify a particular type of section. e.g.: ** [`#] is a simple section break. ** [`#anchor] opens a new section with the ID [`anchor]. ** [`# header] opens a new section with the title "header". ** [`#anchor header] opens a new section with both the ID [`anchor] and the title "header". 
-* [*nonprinting sections] ([`^]): sometimes, you'll want to create a namespace without actually adding a visible new section to the document. you can achieve this by creating a [!nonprinting section] and defining resources within it. nonprinting sections can also be used to store comments, notes, or other information that is useful to have in the source file without it becoming a part of the output -* [*resource] ([`@]): defines a [!resource]. a resource is an file or object that exists outside of the document but which will be included in the document somehow. common examples of resources include images, videos, iframes, or headers/footers. see [>rsrc resources] for more information. -* [*lists] ([`*] [`:]): these are like paragraph nodes, but list nodes that occur next to each other will be arranged so as to show they compose a sequence. depth is determined by the number of stars/colons. like headers, a list entry may have an ID that can be used to refer back to it; it is indicated in the same way. if colons are used, this indicates that the order of the items is signifiant. :-lists and *-lists may be intermixed; however, note than only the last character in the sequence actually controls the depth type. -* [*directives] ([`%]): a directive issues a hint to the renderer in the form of an arbitrary string. directives are normally ignored if they are not supported, but you may cause a warning to be emitted where the directive is not supported with [`%!] or mark a directive critical with [`%!!] so that rendering will entirely fail if it cannot be parsed. +* [*nonprinting sections] ([`^]): sometimes, you'll want to create a namespace without actually adding a visible new section to the document. you can achieve this by creating a [!nonprinting section] and defining resources within it. nonprinting sections can also be used to store comments, notes, to-dos, or other meta-information that is useful to have in the source file without it becoming a part of the output. 
nonprinting sections can be used for a sort of "literate markup," where resource and reference definitions can intermingle with human-readable narrative about those definitions. +* [*resource] ([`@]): defines a [!resource]. a resource is a file or object that exists outside of the document but which is to be included in the document somehow. common examples of resources include images, videos, iframes, or headers/footers. see [>rsrc resources] for more information. +* [*lists] ([`*] [`:]): these are like paragraph nodes, but list nodes that occur next to each other will be arranged so as to show they compose a sequence. depth is determined by the number of stars/colons. like headers, a list entry may have an ID that can be used to refer back to it; it is indicated in the same way. if colons are used, this indicates that the order of the items is significant. [`:]-lists and [`*]-lists may be intermixed; however, note that only the last character in the sequence actually controls the type. a blank line terminates the current list. +* [*directives] ([`%]): a directive issues a hint to the renderer in the form of an arbitrary string. directives are normally ignored if they are not supported, but you may cause a warning to be emitted where the directive is not supported with [`%!] or mark a directive critical with [`%!!] so that rendering will entirely fail if it cannot be obeyed. * [*comments] ([`%%]): a comment is a line of text that is simply ignored by the renderer. * [*asides] ([`!]): indicates text that diverges from the narrative, and can be skipped without interrupting it. think of it like block-level parentheses. asides which follow one another are merged as paragraphs of the same aside, usually represented as a sort of box. if the first line of an aside contains a colon, the stretch of styled-text from the beginning to the aside to the colon will be treated as a "type heading," e.g.
"Warning:" * [*code] ([`~~~]): a line beginning with ~~~ begins or terminates a block of code. code blocks are by default not parsed, but parsing can be activated by preceding the code block with an [`%[*expand]] directive. the opening line should look like one of the below ** [`~~~] ** [`~~~ language] (markdown-style shorthand syntax) @@ -94,24 +102,25 @@ ** [`~~~ \[language\] title ~~~] ** [`~~~ title \[language\] #id ~~~] *[*reference] (tab): a line beginning with a tab is treated as a "reference." references hold out-of-line metadata for preceding text like links and footnotes. a reference consists of an identifier followed by a colon and an arbitrary number of spaces or tabs, followed by text. whether this text is interpreted as raw-text or styled-text depends on the context in which the reference is used. in encodings without tab characters, two preceding blanks can be used instead. * [*quotation] ([`<]): a line of the form [`<[$name]> [$quote]] denotes an utterance by [$name]. * [*blockquote] ([`>]): alternate blockquote syntax. can be nested by repeating the [`>] character. -* [*subtitle] ([`--]): attaches a subtitle to the previous header +* [*subtitle/caption] ([`\--]): attaches a subtitle to the previous header, or caption to the previous object * [*embed] ([`&]): embeds a referenced object. can be used to show images or repeat previously defined objects like lists or tables, optionally with a caption. ** [`&$[$macro] [$arg1]|[$arg2]|[$argn]…] invokes a block-level macro with the supplied arguments *** [`&$mymacro arg 1|arg 2|arg 3] ** [`&[$image]] embeds an image or other block-level object. [!image] can be a reference with a url or file path, or it can be an embed section (e.g. for SVG files) ***[`&myimg All that remained of the unfortunate blood magic pageant contestants and audience (police photo)] -** [`&-[$section] [$styled-text]] embeds a closed disclosure element. 
in interactive outputs, this will display as a block [!section] which can be clicked on to view the full contents of the referenced section; if [$styled-text] is present, it overrides the title of the section you are embedding. in static outputs, the disclosure object will display as an enclosed box with [$styled-text] as the title text +** [`&-[$ident] [$styled-text]] embeds a closed disclosure element containing the text of the named object (a nonprinting section or cortav resource should usually be used to store the content; it can also name an image or video, of course). in interactive outputs, this will display as a block which can be clicked on to view the full contents of the referenced object [$ident]; if [$styled-text] is present, it overrides the title of the section you are embedding (if any). in static outputs, the disclosure object will display as an enclosed box with [$styled-text] as the title text *** [`&-ex-a Prosecution Exhibit A (GRAPHIC CONTENT)] ** [`&+[$section] [$styled-text]] is like the above, but the disclosure element is open by default -* [*horizontal rule] ([`\---]): inserts a horizontal rule or other context break; does not end the section. must be followed by newline. underlines can also be used in place of dashes. -* [*page break] ([`\^^]): for formats that support pagination, like HTML (when printed), indicates that the rest of the current page should be blank. for formats that do not, extra margins will be inserted. does not create a new section -* [*page rule] ([`\^-^]): inserts a page break for formats that support them, and a horizontal rule for formats that do not. does not create a new section +* [*horizontal rule] ([`\---]): inserts a horizontal rule or other context break; does not end the section. must be followed by newline. underlines can also be used in place of dashes ([`___], [`-_-], [`__-__-__] etc), as can horizontal unicode box drawing characters ([`─ ━ ┈] etc). 
+* [*page break] ([`\^^]): for formats that support pagination, like EPUB or HTML (when printed), indicates that the rest of the current page should be blank. for formats that do not, extra margins will be inserted. does not create a new section +* [*page rule] ([`\^-^]): inserts a page break for formats that support them, and a horizontal rule for formats that do not. does not create a new section. comprised of any number of horizontal rule characters surrounded by a pair of carets (e.g. [`^-^] [`^_^] [`^----^] [`^__--^] [`^┈┈┈┈┈^]) * [*table cells] ([`+ |]): see [>ex.tab table examples]. -* [*equations] ([`=]) block-level equations can be inserted with the [`=] +* [*equations] ([`=]): block-level equations can be inserted with the [`=] sequence +* [*cross-references] ([`=>] [`⇒]): inserts a block-level link. uses the same syntax as span links ([`⇒[$ident] [$styled-text]]). can be followed by a caption to add a longer descriptive text. especially useful for gemtext output. ident can be omitted to cross-reference, for example, a physical book. * [*empty lines] (that is, lines consisting of nothing but whitespace) constitute a [!break], which terminates multiline objects that do not have a dedicated termination sequence, for example lists and asides. ##onspans styled text most blocks contain a sequence of spans. these spans are produced by interpreting a stream of [*styled-text] following the control sequence. styled-text is a sequence of codepoints potentially interspersed with escapes. an escape is formed by an open square bracket [`\[] followed by a [*span control sequence], and arguments for that sequence like more styled-text. escapes can be nested. @@ -123,16 +132,16 @@ * underline {obj _|styled-text}: underlines the text. use sparingly on text intended for webpages -- underlined text [!is] distinct from links, but underlining non-links is still a violation of convention. 
* strikeout {obj ~|styled-text}: indicates that its text should be struck through or otherwise indicated for deletion * insertion {obj +|styled-text}: indicates that its text should be indicated as a new addition to the text body. ** consider using a macro definition [`\edit: [~[#1]][+[#2]]] to save typing if you are doing editing work * link \[>[!ref] [!styled-text]\]: produces a hyperlink or cross-reference denoted by [$ref], which may be either a URL specified with a reference or the name of an object like an image or section elsewhere in the document. the unicode characters [`→] and [`🔗] can also be used instead of [`>] to denote a link. -* footnote {span ^|ref|[$styled-text]}: annotates the text with a defined footnote. in interactive output media [`\[^citations.qtheo Quantum Theosophy: A Neophyte's Catechism]] will insert a link with the next [`Quantum Theosophy: A Neophyte's Catechism] that, when clicked, causes a footnote to pop up on the screen. for static output media, the text will simply have a superscript integer after it denoting where the footnote is to be found. +* footnote {span ^|ref|[$styled-text]}: annotates the text with a defined footnote. in interactive output media [`\[^citations.qtheo Quantum Theosophy: A Neophyte's Catechism]] will insert a link with the text [`Quantum Theosophy: A Neophyte's Catechism] that, when clicked, causes a footnote to pop up on the screen. for static output media, the text will simply have a superscript integer after it denoting where the footnote is to be found. * superscript {obj '|[$styled-text]} * subscript {obj ,|[$styled-text]} * raw {obj \\ |[$raw-text]}: causes all characters within to be interpreted literally, without expansion. the only special characters are square brackets, which must have a matching closing bracket, and backslashes. 
* raw literal \[$\\[!raw-text]\]: shorthand for [\[$[\…]]] -* macro [`\{[!name] [!arguments]\}]: invokes a [>ex.mac macro], specified with a reference +* macro [` \{[!name] [!arguments]}]: invokes a [>ex.mac macro], specified with a reference * argument {obj #|var}: in macros only, inserts the [$var]-th argument. otherwise, inserts a context variable provided by the renderer. * raw argument {obj ##|var}: like above, but does not evaluate [$var]. * term {obj &|name}, {span &|name|[$expansion]}: quotes a defined term with a link to its definition, optionally with a custom expansion of the term (for instance, to expand the first use of an acronym) * inline image {obj &@|name}: shows a small image or other object inline. the unicode character [`🖼] can also be used instead of [`&@]. * unicode codepoint {obj U+|hex-integer}: inserts an arbitrary UCS codepoint in the output, specified by [$hex-integer]. lowercase [`u] is also legal. @@ -160,15 +169,15 @@ any identifier (including a reference) that is defined within a named section must be referred to from outside that section as [`[!sec].[!obj]], where [$sec] is the ID of the containing section and [$obj] is the ID of the object one wishes to reference. ##rsrc resources a [!resource] represents content that is not encoded directly into the source file, but which is embedded by some means in the output. resources can either be [!embedded], in which case they are compiled into the final document itself, or they can be [!linked], in which case the final document only contains a URI or similar tag referencing the resource. not all render backends support both linking and embedding embedding, nor do all backends support all object types (for instance, [`groff] does not support video embedding.) -a resource definition is begun by line consisting of an [`@] sign and an [>ident identifier]. this line is followed by any number of parameters. a parameter is a line beginning with a single tab, a keyword, a colon, and a then a value. 
additional lines can be added to a parameter by following it with a line that consists of two tabs followed by the text you wish to add. (this is the same syntax used by references.) a resource definition is terminated by a break, or any line that does not begin with a tab +a resource definition is begun by a line consisting of an [`@] sign and an [>ident identifier]. this line is followed by any number of parameters. a parameter is a line beginning with a single tab, a keyword, a colon, and then a value. additional lines can be added to a parameter by following it with a line that consists of two tabs followed by the text you wish to add. (this is the same syntax used by references.) a resource definition is terminated by a break, or any line that does not begin with a tab a resource definition in use looks like this: -~~~ +~~~cortav this is a demonstration of resources @smiley src: link image/webp http://cdn.example.net/img/smile.webp link image/png file:img/smile.png embed image/gif file img/smile.gif @@ -178,11 +187,11 @@ &smiley ~~~ rendered as HTML, this might produce the following: -~~~ +~~~html