cortav  Check-in [e551f71321]

Overview
Comment:cleanup
SHA3-256: e551f7132141cd8e20431a394b8b043ab1e3b5cae2183bf57b3cb800cd447a3a
User & Date: lexi on 2022-09-05 20:15:08
Context
2022-09-05
20:46
fix typo check-in: c30f235b93 user: lexi tags: trunk
20:15
cleanup check-in: e551f71321 user: lexi tags: trunk
18:49
add blockquote support for html, subdocument mechanisms, mode to generate epub-compatible XHTML5; various fixes and improvements check-in: 35ea3c5797 user: lexi tags: trunk
Changes

Modified cortav.ct from [03a705bddd] to [1f89ec5932].

   129    129   * custom style {span .|id|[$styled-text]}: applies a specially defined font style. for example, if you have defined [`caution] to mean "demibold italic underline", cortav will try to apply the proper weight and styling within the constraints of the current font to the span [$styled-text]. see the [>fonts-sty fonts section] for more information about this mechanism.
   130    130   * literal {obj `|styled-text}: indicates that its text is a reference to a literal sequence of characters or other discrete token. generally rendered in monospace
   131    131   * variable {obj $|styled-text}: indicates to the reader that its text is a placeholder, rather than a literal representation. generally rendered in italic monospace, ideally of a different color
   132    132   * underline {obj _|styled-text}: underlines the text. use sparingly on text intended for webpages -- underlined text  [!is] distinct from links, but underlining non-links is still a violation of convention.
   133    133   * strikeout {obj ~|styled-text}: indicates that its text should be struck through or otherwise indicated for deletion
   134    134   * insertion {obj +|styled-text}: indicates that its text should be indicated as a new addition to the text body.
   135    135   ** consider using a macro definition [`\edit: [~[#1]][+[#2]]] to save typing if you are doing editing work
   136         -* link \[>[!ref] [!styled-text]\]: produces a hyperlink or cross-reference denoted by [$ref], which may be either a URL specified with a reference or the name of an object like an image or section elsewhere in the document. the unicode characters [`→] and [`🔗] can also be used instead of [`>] to denote a link.
          136  +* link [` \[>[$ref] [$styled-text]\]]: produces a hyperlink or cross-reference denoted by [$ref], which may be either a URL specified with a reference or the name of an object like an image or section elsewhere in the document. the unicode characters [`→] and [`🔗] can also be used instead of [`>] to denote a link.
   137    137   * footnote {span ^|ref|[$styled-text]}: annotates the text with a defined footnote. in interactive output media [`\[^citations.qtheo Quantum Theosophy: A Neophyte's Catechism\]] will insert a link with the text [`Quantum Theosophy: A Neophyte's Catechism] that, when clicked, causes a footnote to pop up on the screen. for static output media, the text will simply have a superscript integer after it denoting where the footnote is to be found.
   138    138   * superscript {obj '|[$styled-text]}
   139    139   * subscript {obj ,|[$styled-text]}
   140    140   * raw {obj \\ |[$raw-text]}: causes all characters within to be interpreted literally, without expansion. the only special characters are square brackets, which must have a matching closing bracket, and backslashes.
   141         -* raw literal \[$\\[!raw-text]\]: shorthand for [\[$[\…]]]
   142         -* macro [` \{[!name] [!arguments]}]: invokes a [>ex.mac macro], specified with a reference
          141  +* raw literal [` \["[$raw-text]\]]: shorthand for a raw inside a literal, that is ["[`[\\…]]]
          142  +* macro [` \{[$name] [$arguments]}]: invokes a [>ex.mac macro], specified with a reference
   143    143   * argument {obj #|var}: in macros only, inserts the [$var]-th argument. otherwise, inserts a context variable provided by the renderer.
   144    144   * raw argument {obj ##|var}: like above, but does not evaluate [$var].
   145    145   * term {obj &|name}, {span &|name|[$expansion]}: quotes a defined term with a link to its definition, optionally with a custom expansion of the term (for instance, to expand the first use of an acronym)
   146    146   * inline image {obj &@|name}: shows a small image or other object inline. the unicode character [`🖼] can also be used instead of [`&@].
   147    147   * unicode codepoint {obj U+|hex-integer}: inserts an arbitrary UCS codepoint in the output, specified by [$hex-integer]. lowercase [`u] is also legal.
   148    148   * math mode {obj =|equation}: activates additional transformations on the span to format it as a mathematical equation; e.g. [`*] becomes [`×] and [`/] --> [`÷].
   149         -* extension {span %|ext|…}: invokes extension named in [$ext]. [$ext] will usually be an extension name followed by a symbol (often a period) and then an extension-specific directive, although for some simple extensions it may just be the plain extension name. further syntax and semantics depend on the extension. this syntax can also be used to apply formatting specific to certain renderers, such as assigning a CSS class in the [`html] renderer ([`\[%html.myclass my [!styled] text]]).
          149  +* extension {span %|ext|…}: invokes extension named in [$ext]. [$ext] will usually be an extension name followed by a symbol (often a period) and then an extension-specific directive, although for some simple extensions it may just be the plain extension name. further syntax and semantics depend on the extension. this syntax can also be used to apply formatting specific to certain renderers, such as assigning a CSS class in the [`html] renderer (["[%html.myclass my [!styled] text]]).
   150    150   * critical extension {span %!|ext|…}: like [!extension], but will trigger an error if the requested extension is not available
   151         -* extension text {span %:|ext|styled-text}: like [!extension], but when the requested extension is not present, [$styled-text] will be emitted as-is. this is a better way to apply CSS classes, as the text will still be visible when rendered to formats other than HTML.
          151  +* extension text {span %:|ext|[$styled-text]}: like [!extension], but when the requested extension is not present, [$styled-text] will be emitted as-is. this is a better way to apply CSS classes, as the text will still be visible when rendered to formats other than HTML.
   152    152   * inline comment {obj %%|...}: ignored. useful for editorial annotations not intended to be part of the rendered product.
   153    153   
   154    154   	span: [` \[[*[#1]][$[#2]] [#3]\]]
   155    155   	obj: [` \[[*[#1]][$[#2]]\]]
   156    156   
   157    157   ##tabs tables
   158    158   tables are encoded using a very simple notation. any line that begins with a plus [`+] or bar [`|] denotes a table row. each plus or bar separates one column from the other: a plus opens a new header cell, a bar opens a new normal cell.
................................................................................
   176    176   a resource definition in use looks like this:
   177    177   
   178    178   ~~~cortav
   179    179   this is a demonstration of resources
   180    180   @smiley
   181    181   	src: link image/webp http://cdn.example.net/img/smile.webp
   182    182   		  link image/png file:img/smile.png
   183         -		  embed image/gif file img/smile.gif
          183  +		  embed image/gif file:img/smile.gif
   184    184   	desc: the Smiling Man would like to see you in his office
   185    185   here is the resource in span context [&smiley]
   186    186   and here it is in block context:
   187    187   &smiley
   188    188   ~~~
   189    189   
   190    190   rendered as HTML, this might produce the following:
................................................................................
   245    245   %% (except that the last will require embedding)
   246    246   ~~~
   247    247   
   248    248   inline resources are defined a bit differently:
   249    249   
   250    250   ~~~cortav
   251    251   @smiling-man-business-card text/plain {
   252         -	THE SMILING MAN  | tel. 0-Ω00-666█
   253         -	if you can read this | email: nameless@smiles.gov
          252  +	THE SMILING MAN        | tel. 0-Ω00-666█
          253  +	if you can read this   | email: nameless@smiles.gov
   254    254   	it is already too late | address: right behind you
   255    255   }
   256    256   @smiling-man-business-card image/png;base64 {
   257    257   	%% incomprehensible gibbering redacted
   258    258   }
   259    259   ~~~
   260    260   
   261         -for an inline resource, the identifier is followed by a MIME type and an opening bracket. the opening bracket may be any of the characters [`\{][`\[][`(][`<], and can optionally be followed by additional characters to help disambiguate the closing bracket. the closing bracket is determined by "flipping" the opening bracket, producing bracket pairs like the following:
   262         -* [`\{:][`:}]
   263         -* [`\<!--] [`\--!>]
          261  +for an inline resource, the identifier is followed by a MIME type and an opening bracket. the opening bracket may be any of the characters ["{] ["\[] ["(] ["<], and can optionally be followed by additional characters to help disambiguate the closing bracket. the closing bracket is determined by "flipping" the opening bracket, producing bracket pairs like the following:
          262  +* ["{:][`:}]
          263  +* ["<!--] ["--!>]
   264    264   * [`(*<][`>*)]
   265    265   * [`<>][`<>] [!(disables nesting!)]
   266         -if the open and closing brackets are distinguishable, they will nest appropriately, meaning that [`\{][`\}] alone is very likely to be a safe choice to escape a syntactically correct C program (that doesn't abuse macros too badly). brackets are searched for during parsing; encoded resources are not decoded until a later stage, so a closing bracket character in a base64-encoded text file cannot break out of its escaping.
          266  +if the open and closing brackets are distinguishable, they will nest appropriately, meaning that ["{]["}] alone is very likely to be a safe choice to escape a syntactically correct C program (that doesn't abuse macros too badly). brackets are searched for during parsing; encoded resources are not decoded until a later stage, so a closing bracket character in a base64-encoded text file cannot break out of its escaping.
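The "flipping" rule described above can be sketched in a few lines of Python. This is only an illustration inferred from the four example pairs in the list, not the code cortav itself uses: the names `MIRROR` and `closing_bracket` are hypothetical, and the sketch assumes that flipping means reversing the opening sequence and mirroring every bracket character in it while leaving other characters alone.

```python
# Hypothetical helper: derive the closing bracket of an inline resource
# from its opening bracket, following the example pairs in the text.
# Assumption: "flipping" = reverse the string, then mirror each bracket.
MIRROR = {
    "{": "}", "}": "{",
    "[": "]", "]": "[",
    "(": ")", ")": "(",
    "<": ">", ">": "<",
}

def closing_bracket(opening: str) -> str:
    """Return the flipped form of an opening bracket sequence."""
    # Non-bracket characters such as ':' or '*' are kept unchanged.
    return "".join(MIRROR.get(ch, ch) for ch in reversed(opening))

if __name__ == "__main__":
    for opening in ["{", "{:", "<!--", "(*<", "<>"]:
        print(opening, closing_bracket(opening))
```

Under this reading, `<>` flips to itself, which matches the note that such a pair disables nesting: the opening and closing sequences are no longer distinguishable.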
   267    267   
   268    268   as a convenience, if the first line of the resource definition begins with a single tab, one tab will be dropped from every following line in order to allow legible indentation. similarly, if an opening bracket is followed immediately by a newline, this newline is discarded.
   269    269   
   270    270   text within a resource definition body is not expanded unless the resource definition is preceded with an [`%[*expand]] directive or the resource MIME type is [`text/x.cortav]. if an expand directive is found, the MIME type will be used to try and determine an appropriate type of formatting, potentially invoking a separate renderer. for example, [`text/html] will invoke the [`html] backend, and [`application/x-troff] will invoke the [`groff] backend. if no suitable renderer is available, expansions will generate only plain text.
   271    271   
   272         -two suffixes are accepted: [`;base64] and [`;hex]. the former will decode the presented strings using the base64 algorithm to obtain the resources data; the second will ignore all characters but ASCII hexadecimal digits and derive the resource data byte-by-byte by reading in hexadecimal pairs. for instance, the following sections are equivalent:
          272  +two suffixes are accepted: [`;base64] and [`;hex]. the former will decode the presented strings using the base64 algorithm to obtain the resource's data; the second will ignore all characters but ASCII hexadecimal digits and derive the resource data byte-by-byte by reading in hexadecimal pairs. for instance, the following sections are equivalent:
   273    273   
   274    274   ~~~cortav
   275    275   @propaganda text/plain {
   276    276   	WORLDGOV SAYS
   277    277   	“don't waste time with unproductive thoughts
   278    278   	 your wages will be docked accordingly”
   279    279   }
................................................................................
   331    331   *** [`application/x-troff] can be used to supply sections of text written in raw [`groff] syntax. these are ignored by other renderers.
   332    332   *** [`text/html] can be used to supply sections of text written in raw HTML. these are ignored by non-HTML outputs.
   333    333   *** any MIME-type that matches the type of file being generated by the renderer can be used to include a block of data that will be passed directly to the renderer.
   334    334   ** URI types: additional URI types can be added by extensions or different implementations, but every compliant implementation must support these URIs.
   335    335   *** [`http], [`https]/[`http+tls]: accesses resources over HTTP. add a [`file] fallback if possible for the benefit of renderers/viewers that do not have internet access abilities.
   336    336   *** [`file]: references local files. (the meaning of "local" varies depending on the translation format.) absolute paths should begin [`file:/]; the slash should be omitted for relative paths. note that this doesn't have quite the same meaning as in HTML -- [`file] can (and usually should be) used with HTML outputs to refer to resources that reside on the same server. a cortav URI of [`file:/etc/passwd] will actually result in the link [`/etc/passwd], not [`file:///etc/passwd] when converted to HTML. generally, you only should use [`http] when you're referring to a resource that exists on a different domain.
   337    337   *** [`name]: a special URI used generally for referencing resources that are already installed on a target system and do not need to be embedded or linked, the name and type are enough for a renderer on another machine to locate the correct resource. this is useful mostly for [>fonts fonts], where it's more typical to refer to fonts that are installed on your system rather than providing paths to font files.
   338         -*** [`gemini]: accesses resources over the gemini protocol. currently you should really only use this for [`local] resources unless you're using the gemtext renderer backend, since nothing but gemini browsers are liable to support this protocol.
          338  +*** [`gemini]: accesses resources over the gemini protocol. currently you should really only use this for [`embed] resources unless you're using the gemtext renderer backend, since nothing but gemini browsers are liable to support this protocol.
   339    339   *** [`role]: specifies an abstract resource determined by context, e.g. [`role:backdrop], [`role:body-font]. for use by translators to formats which make provisions for viewer control. a [`role] URI is special in that it is never embedded; it always depends on context — user preferences, environment variables, system stylesheets, what have you — at the time the output file is viewed, rather than the time of the input file being rendered.
   340    340   * [`desc]: supplies a narrative description of the resources, for use as an "alt-text" when the image cannot be loaded and for screenreaders.
   341    341   * [`detail]: supplies extra narrative commentary that is displayed contextually, e.g. when the user hovers her mouse cursor over the embedded object. also used for [`desc] if [`desc] is not supplied.
   342    342   
   343    343   note that in certain cases, full MIME types do not need to be used. say you're defining a font with the [`name] URI -- you can't necessarily know what file type the system fonts on another computer are going to be. in this case, you can just write [`font] instead of [`font/ttf] or [`font/woff2] or similar. all cortav needs to know in this case is what abstract kind of object you're referencing. [`groff] fonts (referenced with the [`dit] URI) don't have a specific MIME type either.
   344    344   
   345    345   
................................................................................
   568    568   ~~~
   569    569   
   570    570   ~~~ tables #tab [cortav] ~~~
   571    571   here is a glossary table.
   572    572   
   573    573   + english :+ ranuir + zia ţai  + thaliste        +
   574    574   | honor   :| tef    | pang     | mbecheve        |
   575         -| rakewym :| hirvag | hi phang | nache umwelinde |
          575  +| rakewyrm:| hirvag | hi phang | nache umwelinde |
   576    576   | eat     :| fese   | dzia     | rotechqa        |
   577    577   
   578    578   and now the other way around!
   579    579   
   580    580   +:english  :| honor |
   581    581   +:ranuir   :| tef   |
   582    582   +:zia ţai  :| pang  |
................................................................................
   608    608   ** the [*groff] render backend ignores [$id]
   609    609   
   610    610   ###tsmog transmogrify
   611    611   a cortav renderer may automatically translate punctuation marks or symbol sequences to superior representations depending on their context. to be compliant this extension should implement, at minimum:
   612    612   * smart quotes (with consideration for the typographical conventions of languages like German or Spanish)
   613    613   ** {dir.d transmogrify|language [$lang]} can be used to explicitly set the language; otherwise, it must be determined from the value of {dir.d pragma|lang}. if this is not present, implementations may fall back on their own methods for determining the language in use, such as command-line flags.
   614    614   * multigraph to glyph conversion, including at least:
   615         -** [`\--] --> "—"
   616         -** [`\-->] --> "→"
   617         -** [`\<--] -->  "←"
          615  +** ["--] --> "—"
          616  +** ["-->] --> "→"
          617  +** ["<--] -->  "←"
   618    618   
   619         -an escape character before any of the sequence characters should prevent the sequence from being rendered. raw nodes (that is, [`\[\…\]] and [`\[`\…\]]) should not be scanned for transmogrification, nor should the contents of code blocks unless marked with the [`%[*expand]] directive
          619  +an escape character before any of the sequence characters should prevent the sequence from being rendered. raw nodes (that is, ["[\…]] and ["["…]]) should not be scanned for transmogrification, nor should the contents of code blocks unless marked with the [`%[*expand]] directive
   620    620   
   621    621   transmogrification shall only take place after all other parsing steps are completed.
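As a rough illustration of the multigraph rules above, the following Python sketch converts the three required sequences and honours an escape character placed before any character of a sequence. It is not taken from the cortav source: `GLYPHS`, `PATTERN` and `transmogrify` are hypothetical names, the escape character is assumed to be a backslash, and the function is assumed to receive plain text from which raw nodes and unexpanded code blocks have already been excluded.

```python
import re

# Multigraph -> glyph table, taken from the list in the text.
GLYPHS = {"<--": "←", "-->": "→", "--": "—"}

def _build_pattern(glyphs):
    # Longest sequences first, so "-->" is not read as "--" followed by ">".
    alternatives = []
    for seq in sorted(glyphs, key=len, reverse=True):
        # Allow an optional backslash (assumed escape) before each character.
        alternatives.append("".join(r"\\?" + re.escape(ch) for ch in seq))
    return re.compile("|".join(alternatives))

PATTERN = _build_pattern(GLYPHS)

def transmogrify(text: str) -> str:
    """Replace unescaped multigraphs with their glyphs."""
    def replace(match):
        found = match.group(0)
        if "\\" in found:
            # An escape anywhere in the sequence suppresses the conversion;
            # the backslash is dropped and the characters are kept as typed.
            return found.replace("\\", "")
        return GLYPHS[found]
    return PATTERN.sub(replace, text)

if __name__ == "__main__":
    print(transmogrify("a -- b --> c <-- d"))
    print(transmogrify(r"literal \--> arrow"))
```

Because the spec requires transmogrification to run only after all other parsing steps, a real renderer would presumably apply something like this to the text of already-parsed spans rather than to the raw source line.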
   622    622   
   623    623   ###hilite hilite
   624    624   code can be highlighted according to the formal language it is written in. a compliant hilite implementation must implement basic keyword, symbol, comment, pragma, and literal highlighting for the following formal languages.
   625    625   * C
   626    626   * [>lua Lua]

Modified cortav.lua from [18c311a386] to [70c60d1282].

   663    663   		}
   664    664   	end
   665    665   	ct.spanctls = {
   666    666   		{seq = '!', parse = formatter 'emph'};
   667    667   		{seq = '*', parse = formatter 'strong'};
   668    668   		{seq = '~', parse = formatter 'strike'};
   669    669   		{seq = '+', parse = formatter 'insert'};
   670         -		{seq = '`\\', parse = rawcode};
   671         -		{seq = '\\\\', parse = rawcode};
          670  +		{seq = '"', parse = rawcode};
          671  +		-- deprecated
          672  +			{seq = '`\\', parse = rawcode};
          673  +			{seq = '\\\\', parse = rawcode};
   672    674   		{seq = '\\', parse = function(s, c) -- raw
   673    675   			return {
   674    676   				kind = 'raw';
   675    677   				spans = {s};
   676    678   				origin = c:clone();
   677    679   			}
   678    680   		end};