Index: cortav.ct ================================================================== --- cortav.ct +++ cortav.ct @@ -16,22 +16,22 @@ * [*headings]: cortav uses almost the same syntax for headings that markdown does, except it only allows the "ATX style" headings, with one or more hash characters at the start of the line. the only differences from markdown are: ** you can use the unicode section character [`§] instead of [`#] if you're feeling snobby ** you must put a space between the control sequence (the sequence of hashes or section symbols, in this case) and the title text. [`# title] creates a section with the heading text "title", but [`#title] creates a new section with no heading at all; instead, it gives the anonymous section the ID [`title]. and of course, you can combine the two: [`#ttl title] creates a section with the heading text "title" and the ID [`ttl]. what are IDs for? we'll get to that in a little bit * [*paragraphs] are mostly the same as in markdown, except that a paragraph break occurs after every newline character, not every blank line. paragraphs can be indented by however many spaces you like; such indentation will be ignored. (tabs have a special meaning, however). in cortav, you can also explicitly mark a line of text as a paragraph by preceding it with a period character ([`.]), which is useful if you want to start a paragraph with text that would otherwise be interpreted specially. -* [*italic text] -- or rather, [!emphasized] text -- is written as [`\[!my spiffy italic text\]]. in cortav, these spans can be nested within other spans (or titles, or table cells, or…), and the starting and ending point is unambiguous. -* [*bold text] -- or rather, [*strong] text -- is written as [`\[*my commanding bold text\]]. -* [*bold-italic text] -- or rather, [![*emphasized strong text]] -- has no specific notation. rather, you create it by nesting one span within the other, for instance: [`\[*[!my ostentatious bold-italic text\]]]. 
-* [*links] are quite different from their markdown equivalents. cortav does not have inline links, as it is intended to be easily readable in both formatted and plain-text format, and long URLs rather disrupt the flow of reading. rather, a link tag is written with the notation [`\[>nifty-link my nifty link\]], where the word [`nifty-link] immediately following the arrow is an [!identifier] indicating the destination of the link. (instead of a greater-than sign, you can also use the unicode arrow symbol [`→].) if the identifier is the same as one you've assigned to a document object, such as a section, cortav produces a link within the document to that object. otherwise, it will look for a [!reference] (or failing that, a [>rsrc resource]) to tell it the URI for the link. if nothing in the document matches the ID, an error will result and compilation will be aborted. (a reference is a key-value pair created by adding a line like [`nifty-link: https://zombo.com] [!indented by exactly one tab]. you can place this reference anywhere you like so long as it's in the same section; if you want to name a reference in another section, you have to prefix it with that section's ID, e.g. [`\[>spiffy-section.nifty-link my nifty link declared in a spiffy section\]].) +* [*italic text] -- or rather, [!emphasized] text -- is written as ["[!my spiffy italic text]]. in cortav, these spans can be nested within other spans (or titles, or table cells, or…), and the starting and ending point is unambiguous. +* [*bold text] -- or rather, [*strong] text -- is written as ["[*my commanding bold text]]. +* [*bold-italic text] -- or rather, [![*overemphasized text]] -- has no specific notation. rather, you create it by nesting one span within the other, for instance: ["[*[!my ostentatious bold-italic text]]]. +* [*links] are quite different from their markdown equivalents. 
cortav does not have inline links, as it is intended to be easily readable in both formatted and plain-text format, and long URLs rather disrupt the flow of reading. rather, a link tag is written with the notation ["[>nifty-link my nifty link]], where the word [`nifty-link] immediately following the arrow is an [!identifier] indicating the destination of the link. (instead of a greater-than sign, you can also use the unicode arrow symbol [`→].) if the identifier is the same as one you've assigned to a document object, such as a section, cortav produces a link within the document to that object. otherwise, it will look for a [!reference] (or failing that, a [>rsrc resource]) to tell it the URI for the link. if nothing in the document matches the ID, an error will result and compilation will be aborted. (a reference is a key-value pair created by adding a line like [`nifty-link: https://zombo.com] [!indented by exactly one tab]. you can place this reference anywhere you like so long as it's in the same section; if you want to name a reference in another section, you have to prefix it with that section's ID, e.g. ["[>spiffy-section.nifty-link my nifty link declared in a spiffy section]].) * [*lists] use a different syntax from markdown. you can start a line with a [`*] to create an unordered list, or [`:] to create an ordered list; indentation doesn't matter. if you want to nest list items, instead of putting two spaces before the child item, you just add another star or colon. and of course, you can nest lists of different kinds within one another. -* [*horizontal rules] use roughly the same syntax: three or more hyphens on a line of their own ([`\---]). underlines also work ([`___], [`-_-], [`__-__-__] etc), as do horizontal unicode box drawing characters ([`─ ━ ┈] etc). +* [*horizontal rules] use roughly the same syntax: three or more hyphens on a line of their own (["---]). 
underlines also work ([`___], [`-_-], [`__-__-__] etc), as do horizontal unicode box drawing characters ([`─ ━ ┈] etc). * some markdown implementations support [*tables]. cortav does too, using a very simple notation similar to the usual notation used in markdown. a key difference, however, is that cortav table cells can contain any formatting a paragraph can. -* [*underlines] are supported by some markdown implementations. in cortav, you can apply them with the notation [`\[_my underlined text\]] -- please just use them sparingly when you render to HTML! -* [*strikethrough] is supported by some extended versions of markdown. cortav uses the notation [`\[~my deleted text\]], with the intended semantics of text that is being removed by some revision of a document. (you can also denote text that is being [!added] by using a plus sign instead of a tilde) +* [*underlines] are supported by some markdown implementations. in cortav, you can apply them with the notation ["[_my underlined text]] -- please just use them sparingly when you render to HTML! +* [*strikethrough] is supported by some extended versions of markdown. cortav uses the notation ["[~my deleted text]], with the intended semantics of text that is being removed by some revision of a document. (you can also denote text that is being [!added] by using a plus sign instead of a tilde) * [*images] are a bit more complicated, but much more versatile. see the section on [>rsrc resources] for an explanation. -* [*smart quotes] and [*em dashes] are inserted automatically, just as in markdown, provided you have the [>tsmog transmogrify] extension available. (it is part of the reference implementation and defined by the spec, but not required.) in fact, you can insert longer dashes than em dashes just by increasing the number of hyphens. the reference implementation's transmogrifier also translates ascii arrows like [`\-->] into their unicode equivalents ([`→]). 
-* [*literals] (also known as [*code text]) can be inserted with the [`\[`int main(void);\]] syntax. note however that literals are not protected from the transmogrifier, and are parsed like any other span, which may cause problems if the source code you're quoting makes use of such forbidden runes. in this case, you'll want to wrap the code span in a raw span. the syntax for this is [`\[`[\\int main(void);\]]], but since this is a bit unwieldy it can also be abbreviated as [`\[`\\int main(void);\]]. +* [*smart quotes] and [*em dashes] are inserted automatically, just as in markdown, provided you have the [>tsmog transmogrify] extension available. (it is part of the reference implementation and defined by the spec, but not required.) in fact, you can insert longer dashes than em dashes just by increasing the number of hyphens. the reference implementation's transmogrifier also translates ascii arrows like ["-->] into their unicode equivalents ([`→]). +* [*literals] (also known as [*code text]) can be inserted with the ["[`int main(void);]] syntax. note however that literals are not protected from the transmogrifier, and are parsed like any other span, which may cause problems if the source code you're quoting makes use of such forbidden runes. in this case, you'll want to wrap the code span in a raw span. the syntax for this is ["[`[\\int main(void);]]], but since this is a bit of an unwieldy syntax for a common operation, it can also be abbreviated as ["["int main(void);]]. of course, this is only a small taste of what cortav can do, not even touching on key features like macros, footnotes, or equation formatting. read the sections on [>onblocks blocks] and [>onspans spans] for all the gory details. ## encoding a cortav document is made up of a sequence of codepoints. UTF-8 must be supported, but other encodings (such as UTF-32 or C6B) may be supported as well. lines will be derived by splitting the codepoints at the linefeed character or equivalent. 
note that unearthly encodings like C6B or EBCDIC will need to select their own control sequences. @@ -99,11 +99,15 @@ ** [`~~~ \[language\] #id ~~~] ** [`~~~ title ~~~] ** [`~~~ title \[language\] ~~~] ** [`~~~ \[language\] title ~~~] ** [`~~~ title \[language\] #id ~~~] -*[*reference] (tab): a line beginning with a tab is treated as a "reference." references hold out-of-line metadata for preceding text like links and footnotes. a reference consists of an identifier followed by a colon and an arbitrary number of spaces or tabs, followed by text. whether this text is interpreted as raw-text or styled-text depends on the context in which the reference is used. in encodings without tab characters, two preceding blanks can be used instead. +*[*definition] ([^def-ex tab]): a line [^def-tab-enc beginning with a tab] is a multipurpose metadata syntax. the tab may be followed by an identifier, a colon, and a value string, in which case it opens a new definition; alternatively, a second tab character turns the line into a [*definition continuation], adding the remaining characters as a new line to the definition value on the previous line. when a new definition is opened on a line immediately following certain kinds of objects, such as a resource, it attaches key-value metadata to that object. when a definition is not preceded by such an object, an independent [*reference] is created instead. +** a [*reference] is a general mechanism for out-of-line metadata, and references are used in many different ways -- e.g. to specify link destinations, footnote contents, abbreviations, or macros. to ensure that a definition is interpreted as a reference, rather than as metadata for an object, precede it with a blank line. + def-tab-enc: in encodings without tab characters, a definition is opened by a line beginning with two blanks, and continued by a line beginning with four blanks.
+ def-ex: [*open a new reference]: [`[!\\t][$key]: [$value]] + [*continue a reference]: [`[!\\t\\t][$value]] * [*quotation] ([`<]): a line of the form [`<[$name]> [$quote]] denotes an utterance by [$name]. * [*blockquote] ([`>]): alternate blockquote syntax. can be nested by repeating the [`>] character. * [*subtitle/caption] ([`\--]): attaches a subtitle to the previous header, or caption to the previous object * [*embed] ([`&]): embeds a referenced object. can be used to show images or repeat previously defined objects like lists or tables, optionally with a caption. ** [`$[$macro] [$arg1]|[$arg2]|[$argn]…] invokes a block-level macro with the supplied arguments @@ -164,11 +168,51 @@ you can finish each row with a bar or plus character, but it's not necessary. only do it if you think it makes the source easier to read. * [>ex.tab an example of table notation] ##ident identifiers -any identifier (including a reference) that is defined within a named section must be referred to from outside that section as [`[!sec].[!obj]], where [$sec] is the ID of the containing section and [$obj] is the ID of the object one wishes to reference. +an identifier is a string which unambiguously names a section, block, reference, or other object of interest. every section has its own identifier namespace; to reference an object in one section from a different section, the identifier must be written as [`[$sec].[$obj]], where [$sec] is the ID of the containing section and [$obj] is the ID of the object one wishes to reference. subdocuments (such as blockquotes or resources of type [`text/x.cortav]) have their own namespace collection, so an object defined within e.g. a blockquote will not escape to the enclosing context; however, subdocuments can reference objects from the containing document in the usual fashion. + +identifiers can be composed through interpolation in macro expansions. 
for instance, the macro expansion +~~~cortav + xref: (see [>link-[#1] [#2]] by [#3]) +the 25,953CE accession of the Hyperion Entity to the Throne Unyielding is now widely considered by the collective of ascended masters to have been fraudulent {xref disc-artax|Discursus Immundus on the Immaterial Doctrines of Redemption & Liquidation|Hierophant Artaxerxes MXIV}, but at the time was received with the near-unanimous adulation of the Manifold Hierophanies. an early dissenting voice, the Kakistarch Philomene Adumbratio of Forbidden Zone 969, +~~~ +is equivalent to +~~~cortav +the 25,953CE accession of the Hyperion Entity to the Throne Unyielding is now widely considered by the collective of ascended masters to have been fraudulent (see [>link-disc-artax Discursus Immundus on the Immaterial Doctrines of Redemption & Liquidation] by Hierophant Artaxerxes MXIV), but at the time was received with the near-unanimous adulation of the Manifold Hierophanies. an early dissenting voice, the Kakistarch Philomene Adumbratio of Forbidden Zone 969, +~~~ +identifiers dereferenced through macro expansions which lack an explicit section prefix are first evaluated in the context of the section [!in which the macro was defined], rather than the section in which it was expanded. the latter is only searched if the definition section has no object with a matching identifier. this behavior, while useful, is not always desirable. to force the resulting identifier (whether composed through interpolation or written out explicitly) to be evaluated in the context of the macro expansion, prefix it with a period ([`.]) to form an [*expansion-site identifier]. 
for example: +~~~cortav +#alpha section alpha + link: http://example.net + macro-plain-id: [>link link to example.net] + macro-expsite-id: [>.link link to section-dependent destination] + +here are links to example.net: +* {macro-plain-id} +* {macro-expsite-id} +* {beta.macro-expsite-id} + +here are links to both sites: +* {macro-plain-id} [%% example.net] +* {beta.macro-plain-id} [%% zombo com] + +#beta section beta + link: http://zombo.com + macro-plain-id: [>link link to zombo com] + macro-expsite-id: [>.link link to some website somewhere] + +here are links to zombo com: +* {macro-plain-id} +* {macro-expsite-id} +* {alpha.macro-expsite-id} + +here are links to both sites: +* {macro-plain-id} [%% zombo com] +* {alpha.macro-plain-id} [%% example.net] +~~~ ##rsrc resources a [!resource] represents content that is not encoded directly into the source file, but which is embedded by some means in the output. resources can either be [!embedded], in which case they are compiled into the final document itself, or they can be [!linked], in which case the final document only contains a URI or similar tag referencing the resource. not all render backends support both linking and embedding embedding, nor do all backends support all object types (for instance, [`groff] does not support video embedding.) a resource definition is begun by line consisting of an [`@] sign and an [>ident identifier]. this line is followed by any number of parameters. a parameter is a line beginning with a single tab, a keyword, a colon, and a then a value. additional lines can be added to a parameter by following it with a line that consists of two tabs followed by the text you wish to add. (this is the same syntax used by references.) a resource definition is terminated by a break, or any line that does not begin with a tab @@ -294,25 +338,26 @@ ~~~ inline resources can also offer a cleaner syntax for complex multiline macros. 
~~~cortav @def text/x.cortav { - * [*[#1]] [!([#2]) + * [*[#1]] [!([#2])] *: [#3] } &def nuclear bunker|n|that which will not protect you from the Smiling Man ~~~ to make this usage simpler, resources with a type of [`text/x.cortav] can omit the MIME type field. inline resources are a great way to extend cortav with implementation-dependent features. say you want mathtex in your cortav renderer -- all you have to do is support a new MIME type [`text/x.mathtex], and then the users can embed their math equations like so: ~~~cortav -and as we see from the value of κ below, Bose-Fleischer-Kincaid entities of Carlyle subtype γ lack interaction with the putative "Higgs field" of Athabaskan Windchime Theory, seemingly ruling out any possibility of direct interaction with the spacetime metric, and consequently removing the maximal density "limitations" that exist for bosonic matter. +and as we see from the value of κ below, Bose-Fleischer-Kincaid entities of Carlyle subtype γ do not interact at all with the putative "Higgs field" of Athabaskan Windchime Theory, seemingly ruling out any distortion of the spacetime metric, and consequently removing the maximal density parameter that is defined for bosonic matter. @ text/x.mathtex {> %% divide subtract differentiate blah blah blah i don't know math <} -given the selective cross-interaction of γ-BFKs, we conclude that, under the prevailing cosmocelestial paradigm, the answer to the age-old question of how many angels can dance on the head of a pin is "as many as would like to." +given the selective cross-interaction of γ-BFKs, we conclude that, under the prevailing cosmocelestial paradigm, the answer to the age-old question of how many angels can dance on the head of a pin is [^assump "as many as would like to"] + assump: assuming a perfectly spherical angel in a vacuum ~~~ ### supported parameters * [`src] (all): specifies where to find the file, what it is, and how to embed it. 
each line of [`src] should consist of two whitespace-separated words: MIME type and URI. the specification can also be prefixed with an extra word, [`auto], [`link], or [`embed], to control how the resource will be referenced from the output file. ** reference mode: the optional first word; if the requested reference mode is not applicable or valid for the output format or URI given, the source line will be skipped over. @@ -331,11 +376,12 @@ *** [`application/x-troff] can be used to supply sections of text written in raw [`groff] syntax. these are ignored by other renderers. *** [`text/html] can be used to supply sections of text written in raw HTML. these are ignored by non-HTML outputs. *** any MIME-type that matches the type of file being generated by the renderer can be used to include a block of data that will be passed directly to the renderer. ** URI types: additional URI types can be added by extensions or different implementations, but every compliant implementation must support these URIs. *** [`http], [`https]/[`http+tls]: accesses resources over HTTP. add a [`file] fallback if possible for the benefit of renderers/viewers that do not have internet access abilities. -*** [`file]: references local files. (the meaning of "local" varies depending on the translation format.) absolute paths should begin [`file:/]; the slash should be omitted for relative paths. note that this doesn't have quite the same meaning as in HTML -- [`file] can (and usually should be) used with HTML outputs to refer to resources that reside on the same server. a cortav URI of [`file:/etc/passwd] will actually result in the link [`/etc/passwd], not [`file:///etc/passwd] when converted to HTML. generally, you only should use [`http] when you're referring to a resource that exists on a different domain. +*** [`file]: references local files. (the meaning of "local" varies depending on the translation format.) 
absolute paths should begin [`file:/]; the slash should be omitted for relative paths. note that this doesn't have quite the same meaning as in HTML -- [`file] can (and usually should be) used with HTML outputs to refer to resources that reside on the same server. a cortav URI of [`file:/etc/passwd] will actually result in the link [`/etc/passwd], not [`file:///etc/passwd] when converted to HTML. generally, you should only use [`http] when you're referring to a resource that exists on a different domain. on systems where text and binary files are handled differently, the URIs [`file+txt:] and [`file+bin:] can be used to specify an opening mode. +*** [`asset]: identical to [`file], except that paths are interpreted relative to the asset base (the parent directory of the source file if not otherwise defined), rather than the current working directory of the [`cortav] translator process. *** [`name]: a special URI used generally for referencing resources that are already installed on a target system and do not need to be embedded or linked, the name and type are enough for a renderer on another machine to locate the correct resource. this is useful mostly for [>fonts fonts], where it's more typical to refer to fonts that are installed on your system rather than providing paths to font files. *** [`gemini]: accesses resources over the gemini protocol. currently you should really only use this for [`embed] resources unless you're using the gemtext renderer backend, since nothing but gemini browsers are liable to support this protocol. *** [`role]: specifies an abstract resource determined by context, e.g. [`role:backdrop], [`role:body-font]. for use by translators to formats which make provisions for viewer control.
a [`role] URI is special in that it is never embedded; it always depends on context — user preferences, environment variables, system stylesheets, what have you — at the time the output file is viewed, rather than the time of the input file being rendered. * [`desc]: supplies a narrative description of the resources, for use as an "alt-text" when the image cannot be loaded and for screenreaders. * [`detail]: supplies extra narrative commentary that is displayed contextually, e.g. when the user hovers her mouse cursor over the embedded object. also used for [`desc] if [`desc] is not supplied. Index: cortav.lua ================================================================== --- cortav.lua +++ cortav.lua @@ -123,18 +123,53 @@ ctx.doc.src = src ctx.sec = doc:mksec() -- toplevel section ctx.sec.origin = ctx:clone() end; ref = function(self,id) + if self.invocation then + -- allow IDs to contain template substitutions by mimicking the [#n] syntax + id = id:gsub('%b[]', function(sp) + -- should indirection be allowed here? TODO + if sp:sub(2,2) == '#' then + local n = tonumber(sp:sub(3,-2)) + if n == nil then + self:fail('invalid template substitution “%s” in ID “%s”', sp, id) + end + local arg = self.invocation.args[n] + if arg == nil then + self:fail('template instantiation requires at least %u arguments (in ID “%s”)',n,id) + end + return arg + else return sp end + end) + + end if not id:find'%.' then local rid = self.sec.refs[id] - if self.sec.refs[id] then - return self.sec.refs[id], id, self.sec - else self:fail("no such ref %s in current section", id or '') end + if rid then + return rid, id, self.sec + end + + --nothing in the current section, but this ID could be looked up in the context of a macro expansion. 
if so, check section of the site of invocation as well + if self.invocation then + rid = self.invocation.origin:ref(id) + if rid then + return rid, id, self.invocation.origin.sec + end + end + + self:fail("no such ref %s in current section", id or '') else local sec, ref = string.match(id, "(.-)%.(.+)") - local s = self.doc.sections[sec] + local s + if sec == '' then + if self.invocation == nil then + self:fail('site-of-invocation IDs can only be dereferenced in a macro expansion (offending ID: “%s”)', id) + end + s = self.invocation.origin.sec + end + s = s or self.doc.sections[sec] if not s then -- fall back on inheritance tree for i, p in ipairs(self.doc.parents) do if p.sections[sec] then s = p.sections[sec] break @@ -1297,19 +1332,23 @@ macro = id; args = argv; } end)}; {seq='&', fn=blockwrap(function(s,c) - local id, cap = s:match('^&([^%s]+)%s*(.-)%s*$') + local mode, id, cap = s:match('^&([-+]?)([^%s]+)%s*(.-)%s*$') if id == nil or id == '' then c:fail 'malformed embed block' end - if cap == '' then cap = nil end + if cap == '' then cap = nil end + if mode == '-' then mode = 'closed' + elseif mode == '+' then mode = 'open' + else mode = 'inline' end return { kind = 'embed'; ref = id; cap = cap; + mode = mode; } end)}; {fn = insert_paragraph}; } Index: render/html.lua ================================================================== --- render/html.lua +++ render/html.lua @@ -867,12 +867,15 @@ if next(ctr.nodes) == nil then idx = 1 fbimg = { elt = 'img'; --fallback attrs = { - alt = ''; + alt = obj.props.desc or obj.props.detail or ''; + title = obj.props.detail; src = uri; + width = obj.props.width; + height = obj.props.height; }; } else idx = #ctr.nodes end table.insert(ctr.nodes, idx, { elt = 'source'; --fallback @@ -971,11 +974,32 @@ if rtype[1] < src.mime then rtype[2](src, top) end end local ft = flatten(top) - return ft + local cap = b.cap or obj.props.desc or obj.props.detail + if b.mode == 'inline' then + -- TODO insert caption + return ft + else + 
local prop = {} + if b.mode == 'open' then + prop.open = true + end + return tag('details', prop, catenate { + tag('summary', {}, + cap and ( + -- the block here should really be the relevant + -- ref definition if an override caption isn't + -- specified, but oh well + sr.htmlSpan(spanparse( + cap, b.origin + ), b, s) + ) or ''); + ft; + }) + end end function block_renderers.macro(b,s) local all = renderSubdoc(b.doc) local cat = catenate(ss.map(flatten,all)) Index: sirsem.lua ================================================================== --- sirsem.lua +++ sirsem.lua @@ -485,11 +485,10 @@ local triple = {string.byte(str, i, i+2)} local T = function(q) return triple[q] or 0 end local B = function(q) - print(q) if q <= 25 then return string.char(0x41 + q) elseif q <= 51 then return string.char(0x61 + (q-26)) elseif q <= 61 then @@ -1476,11 +1475,10 @@ for k,v in pairs(mimeclasses) do if me > ss.mime(k) then c = v break end end - print(c) return c == pc end; }; } ss.mime.exn = ss.exnkind 'MIME error'
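
The embed-block change to cortav.lua in this patch can be tried in isolation. What follows is only a minimal sketch: `parse_embed` is a hypothetical standalone helper, not a function in cortav.lua. The real code runs inside `blockwrap` and reports failures through `c:fail`, which is replaced here by a plain `error` call. The match pattern and the mode names are copied from the patch.

```lua
-- hypothetical standalone helper for illustration; not part of cortav.lua
local function parse_embed(s)
	-- an optional '-' or '+' directly after '&' selects the embed mode;
	-- the next run of non-space characters is the ID, the rest is the caption
	local mode, id, cap = s:match('^&([-+]?)([^%s]+)%s*(.-)%s*$')
	if id == nil or id == '' then
		-- assumption: plain error() stands in for c:fail
		error 'malformed embed block'
	end
	if cap == '' then cap = nil end
	if     mode == '-' then mode = 'closed'
	elseif mode == '+' then mode = 'open'
	else                    mode = 'inline' end
	return { kind = 'embed', ref = id, cap = cap, mode = mode }
end
```

Under this sketch, `&fig` yields mode `inline`, `&-fig` yields `closed`, and `&+fig` yields `open`. In the render/html.lua hunk of this patch, the latter two wrap the embed in a `details` element (with the `open` property set for `open`), while `inline` returns the flattened embed unchanged.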