Index: cortav.ct ================================================================== --- cortav.ct +++ cortav.ct @@ -27,11 +27,11 @@ * some markdown implementations support [*tables]. cortav does too, using a very simple notation similar to the usual notation used in markdown. a key difference, however, is that cortav table cells can contain any formatting a paragraph can. * [*underlines] are supported by some markdown implementations. in cortav, you can apply them with the notation [`\[_my underlined text\]] -- please just use them sparingly when you render to HTML! * [*strikethrough] is supported by some extended versions of markdown. cortav uses the notation [`\[~my deleted text\]], with the intended semantics of text that is being removed by some revision of a document. (you can also denote text that is being [!added] by using a plus sign instead of a tilde) * [*images] are a bit more complicated, but much more versatile. see the section on [>rsrc resources] for an explanation. * [*smart quotes] and [*em dashes] are inserted automatically, just as in markdown, provided you have the [>tsmog transmogrify] extension available. (it is part of the reference implementation and defined by the spec, but not required.) in fact, you can insert longer dashes than em dashes just by increasing the number of hyphens. the reference implementation's transmogrifier also translates ascii arrows like [`\-->] into their unicode equivalents ([`→]). -* [*literals] (also known as [*code text]) can be inserted with the [`\[`int main(void);] syntax. note however that literals are not protected from the transmogrifier, and are parsed like any other span, which may cause problems if the source code you're quoting makes use of such forbidden runes. in this case, you'll want to wrap the code span in a raw span. the syntax for this is [`\[`[\\int main(void);\]]], but since this is a bit unwieldy it can also be abbreviated as [`\[`\\int main(void);\]]. 
+* [*literals] (also known as [*code text]) can be inserted with the [`\[`int main(void);\]] syntax. note however that literals are not protected from the transmogrifier, and are parsed like any other span, which may cause problems if the source code you're quoting makes use of such forbidden runes. in this case, you'll want to wrap the code span in a raw span. the syntax for this is [`\[`[\\int main(void);\]]], but since this is a bit unwieldy it can also be abbreviated as [`\[`\\int main(void);\]]. of course, this is only a small taste of what cortav can do, not even touching on key features like macros, footnotes, or equation formatting. read the sections on [>onblocks blocks] and [>onspans spans] for all the gory details. ## encoding a cortav document is made up of a sequence of codepoints. UTF-8 must be supported, but other encodings (such as UTF-32 or C6B) may be supported as well. lines will be derived by splitting the codepoints at the linefeed character or equivalent. note that unearthly encodings like C6B or EBCDIC will need to select their own control sequences. @@ -78,18 +78,18 @@ cortav is based on an HTML-like block model, where a document consists of sections, which are made up of blocks, which may contain a sequence of spans. flows of text are automatically conjoined into spans, and blocks are separated by one or more newlines. this means that, unlike in markdown, a single logical paragraph [*cannot] span multiple ASCII lines. the primary purpose of this was to ensure ease of parsing, but also, both markdown and cortav are supposed to be readable from within a plain text editor. this is the 21st century. every reasonable text editor supports soft word wrap, and if yours doesn't, that's entirely your own damn fault. hard-wrapping lines is incredibly user-hostile, especially to users on mobile devices with small screens. cortav does not allow it. the first character(s) of every line (the "control sequence") indicates the role of that line. 
if no control sequence is recognized, the line is treated as a paragraph. the currently supported control sequences are listed below. some control sequences have alternate forms, in order to support modern, readable unicode characters as well as plain ascii text. * [*paragraphs] ([`.] [` ¶] [`❡]): a paragraph is a simple block of text. the period control sequence is only necessary if the paragraph text starts with text that would be interpreted as a control sequence otherwise -* newlines [` \\]: inserts a line break into previous paragraph and attaches the following text. mostly useful for poetry or lyrics +* [*newlines] [` \\]: inserts a line break into previous paragraph and attaches the following text. mostly useful for poetry or lyrics * [*section starts] [`#] [`§]: starts a new section. all sections have an associated depth, determined by the number of sequence repetitions (e.g. "###" indicates depth three). sections may have headers and IDs; both are optional. IDs, if present, are a sequence of raw-text immediately following the hash marks. if the line has one or more space character followed by styled-text, a header will be attached. the character immediately following the hashes can specify a particular type of section. e.g.: ** [`#] is a simple section break. ** [`#anchor] opens a new section with the ID [`anchor]. ** [`# header] opens a new section with the title "header". ** [`#anchor header] opens a new section with both the ID [`anchor] and the title "header". * [*nonprinting sections] ([`^]): sometimes, you'll want to create a namespace without actually adding a visible new section to the document. you can achieve this by creating a [!nonprinting section] and defining resources within it. nonprinting sections can also be used to store comments, notes, to-dos, or other meta-information that is useful to have in the source file without it becoming a part of the output. 
nonprinting sections can be used for a sort of "literate markup," where resource and reference definitions can intermingle with human-readable narrative about those definitions. -* [*resource] ([`@]): defines a [!resource]. a resource is a file or object that exists outside of the document but which will are to be included in the document somehow. common examples of resources include images, videos, iframes, or headers/footers. see [>rsrc resources] for more information. +* [*resource] ([`@]): defines a [!resource]. a resource is a file or object that is to be embedded in the document somehow. common examples of resources include images, videos, iframes, or headers/footers. resources can be defined inline, or reference external objects. see [>rsrc resources] for more information. * [*lists] ([`*] [`:]): these are like paragraph nodes, but list nodes that occur next to each other will be arranged so as to show they compose a sequence. depth is determined by the number of stars/colons. like headers, a list entry may have an ID that can be used to refer back to it; it is indicated in the same way. if colons are used, this indicates that the order of the items is significant. [`:]-lists and [`*]-lists may be intermixed; however, note that only the last character in the sequence actually controls the type. a blank line terminates the current list. * [*directives] ([`%]): a directive issues a hint to the renderer in the form of an arbitrary string. directives are normally ignored if they are not supported, but you may cause a warning to be emitted where the directive is not supported with [`%!] or mark a directive critical with [`%!!] so that rendering will entirely fail if it cannot be obeyed. * [*comments] ([`%%]): a comment is a line of text that is simply ignored by the renderer. * [*asides] ([`!]): indicates text that diverges from the narrative, and can be skipped without interrupting it. think of it like block-level parentheses. 
asides which follow one another are merged as paragraphs of the same aside, usually represented as a sort of box. if the first line of an aside contains a colon, the stretch of styled-text from the beginning of the aside to the colon will be treated as a "type heading," e.g. "Warning:" * [*code] ([`~~~]): a line beginning with ~~~ begins or terminates a block of code. code blocks are by default not parsed, but parsing can be activated by preceding the code block with an [`%[*expand]] directive. the opening line should look like one of the below @@ -126,17 +126,17 @@ * strong {obj *|styled-text}: causes its text to stand out from the narrative, generally rendered as bold or a brighter color. * emphatic {obj !|styled-text}: indicates that its text should be spoken with emphasis, generally rendered as italics * custom style {span .|id|[$styled-text]}: applies a specially defined font style. for example, if you have defined [`caution] to mean "demibold italic underline", cortav will try to apply the proper weight and styling within the constraints of the current font to the span [$styled-text]. see the [>fonts-sty fonts section] for more information about this mechanism. * literal {obj `|styled-text}: indicates that its text is a reference to a literal sequence of characters or other discrete token. generally rendered in monospace -* variable {obj $|styled-text}: indicates that its text is a stand-in that will be replaced with what it names. generally rendered in italic monospace, ideally of a different color +* variable {obj $|styled-text}: indicates to the reader that its text is a placeholder, rather than a literal representation. generally rendered in italic monospace, ideally of a different color * underline {obj _|styled-text}: underlines the text. use sparingly on text intended for webpages -- underlined text [!is] distinct from links, but underlining non-links is still a violation of convention. 
* strikeout {obj ~|styled-text}: indicates that its text should be struck through or otherwise indicated for deletion * insertion {obj +|styled-text}: indicates that its text should be indicated as a new addition to the text body. ** consider using a macro definition [`\edit: [~[#1]][+[#2]]] to save typing if you are doing editing work * link \[>[!ref] [!styled-text]\]: produces a hyperlink or cross-reference denoted by [$ref], which may be either a URL specified with a reference or the name of an object like an image or section elsewhere in the document. the unicode characters [`→] and [`🔗] can also be used instead of [`>] to denote a link. -* footnote {span ^|ref|[$styled-text]}: annotates the text with a defined footnote. in interactive output media [`\[^citations.qtheo Quantum Theosophy: A Neophyte's Catechism]] will insert a link with the text [`Quantum Theosophy: A Neophyte's Catechism] that, when clicked, causes a footnote to pop up on the screen. for static output media, the text will simply have a superscript integer after it denoting where the footnote is to be found. +* footnote {span ^|ref|[$styled-text]}: annotates the text with a defined footnote. in interactive output media [`\[^citations.qtheo Quantum Theosophy: A Neophyte's Catechism\]] will insert a link with the text [`Quantum Theosophy: A Neophyte's Catechism] that, when clicked, causes a footnote to pop up on the screen. for static output media, the text will simply have a superscript integer after it denoting where the footnote is to be found. * superscript {obj '|[$styled-text]} * subscript {obj ,|[$styled-text]} * raw {obj \\ |[$raw-text]}: causes all characters within to be interpreted literally, without expansion. the only special characters are square brackets, which must have a matching closing bracket, and backslashes. 
* raw literal \[$\\[!raw-text]\]: shorthand for [\[$[\…]]] * macro [` \{[!name] [!arguments]}]: invokes a [>ex.mac macro], specified with a reference @@ -650,24 +650,26 @@ used files should return a table with the following members * [`macros]: an array of functions that return strings or arrays of strings when invoked. these will be injected into the global macro namespace. ###ts ts -the [*ts] extension allows documents to be marked up for basic classification constraints and automatically redacted. if you are seriously relying on ts for confidentiality, make damn sure you start the file with [$%[*requires] ts], so that rendering will fail with an error if the extension isn't supported. +the [*ts] extension allows documents to be marked up for basic classification constraints and automatically redacted. if you are seriously relying on [`ts] for confidentiality, make damn sure you start the file with [$%[*requires] ts], so that rendering will fail with an error if the extension isn't supported. -ts enables the directives: -* [`%[*ts] class [$scope level] ([$styled-text])]: indicates a classification level for either the while document (scope [$doc]) or the next section (scope [$sec]). if the ts level is below [$level], the section will be redacted or rendering will fail with an error, as appropriate. if styled-text is included, this will be treated as the name of the classification level. +[`ts] currently has no support for misinformation. + +[`ts] enables the directives: +* [`%[*ts] class [$scope level] ([$styled-text])]: indicates a classification level for either the whole document (scope [$doc]) or the next section (scope [$sec]). if the ts level is below [$level], the section will be redacted or rendering will fail with an error, as appropriate. if styled-text is included, this will be treated as the name of the classification level. * [`%[*ts] word [$scope word] ([$styled-text])]: indicates a codeword clearance that must be present for the text to render. 
if styled-text is present, this will be used to render the name of the codeword instead of [$word]. * [`%[*when] ts level [$level]] * [`%[*when] ts word [$word]] -ts enables the spans: -* [`\[🔒#[!level] [$styled-text]\]]: redacts the span if the security level is below that specified. -* [`\[🔒.[!word] [$styled-text]\]]: redacts the span if the specified codeword clearance is not enabled. +[`ts] enables the spans: +* [` \[🔒#[$level] [$styled-text]\]]: redacts the span if the security level is below that specified. +* [` \[🔒.[$word] [$styled-text]\]]: redacts the span if the specified codeword clearance is not enabled. (the padlock emoji is shorthand for [`%[*ts]].) -ts redacts spans securely; that is, they are simply replaced with an indicator that they have been redacted, without visually leaking the length of the redacted text. +[`ts] redacts spans securely; that is, they are simply replaced with an indicator that they have been redacted, without visually leaking the length of the redacted text. redacted sections are simply omitted. ~~~#ts-example example [cortav] ~~~ %ts word doc sorrowful-pines SORROWFUL PINES # intercept R1440 TCT S3 @@ -831,21 +833,21 @@ tengwar: https://en.wikipedia.org/wiki/Tengwar ###refimpl-switches switches [`cortav.lua] offers various switches to control its behavior. 
+ long + short + function + -| [`\--out [!file]] :|:[`-o]:| sets the output file (default stdout) | -| [`\--log [!file]] :|:[`-l]:| sets the log file (default stderr) | -| [`\--define [!var] [!val]] :|:[`-d]:| sets the context variable [$var] to [$val] | -| [`\--mode-set [!mode]] :|:[`-y]:| activates the [>refimpl-mode mode] with ID [!mode] -| [`\--mode-clear [!mode]] :|:[`-n]:| disables the mode with ID [!mode] | -| [`\--mode [!id] [!val]] :|:[`-m]:| configures mode [!id] with the value [!val] | -| [`\--mode-set-weak [!mode]] :|:[`-Y]:| activates the [>refimpl-mode mode] with ID [!mode] if the source file does not specify otherwise -| [`\--mode-clear-weak [!mode]] :|:[`-N]:| disables the mode with ID [$mode] if the source file does not specify otherwise -| [`\--mode-weak [!id] [!val]] :|:[`-M]:| configures mode [$id] with the value [$val] if the source file does not specify otherwise -| [`\--help] :|:[`-h]:| display online help | -| [`\--version] :|:[`-V]:| display the interpreter version | +| [`--out [$file]] :|:[`-o]:| sets the output file (default stdout) | +| [`--log [$file]] :|:[`-l]:| sets the log file (default stderr) | +| [`--define [$var] [$val]] :|:[`-d]:| sets the context variable [$var] to [$val] | +| [`--mode-set [$mode]] :|:[`-y]:| activates the [>refimpl-mode mode] with ID [!mode] +| [`--mode-clear [$mode]] :|:[`-n]:| disables the mode with ID [!mode] | +| [`--mode [$id] [$val]] :|:[`-m]:| configures mode [$id] with the value [$val] | +| [`--mode-set-weak [$mode]] :|:[`-Y]:| activates the [>refimpl-mode mode] with ID [$mode] if the source file does not specify otherwise +| [`--mode-clear-weak [$mode]] :|:[`-N]:| disables the mode with ID [$mode] if the source file does not specify otherwise +| [`--mode-weak [$id] [$val]] :|:[`-M]:| configures mode [$id] with the value [$val] if the source file does not specify otherwise +| [`--help] :|:[`-h]:| display online help | +| [`--version] :|:[`-V]:| display the interpreter version | ###refimpl-mode modes 
most of [`cortav.lua]'s implementation-specific behavior is controlled by use of [!modes]. these are namespaced options which may have a boolean, string, or numeric value. boolean modes are set with the [`-y] [`-n] flags; other modes use the [`-m] flags. most modes are defined by the renderer backend. the following modes affect the behavior of the frontend: @@ -865,14 +867,16 @@ * string (css length) [`html:width] sets a maximum width for the body content in order to make the page more readable on large displays * number [`html:accent] applies an accent hue to the generated webpage. the hue is specified in degrees, e.g. [$-m html:accent 0] applies a red accent. * flag [`html:dark-on-light] uses dark-on-light styling, instead of the default light-on-dark * flag [`html:fossil-uv] outputs an HTML snippet suitable for use with the Fossil VCS webserver. this is intended to be used with the unversioned content mechanism to host rendered versions of documentation written in cortav that's stored in a Fossil repository. +* flag [`html:xhtml] generates syntactically-`valid' XHTML5 +* flag [`html:epub] generates XHTML5 suitable for use in an EPUB3 archive * number [`html:hue-spread] generates a color palette based on the supplied accent hue. the larger the value, the more the other colors diverge from the accent hue. * string [`html:link-css] generates a document linking to the named stylesheet * flag [`html:gen-styles] embeds appropriate CSS styles in the document (default on) -* flag [`html:snippet] produces a snippet of html instead of an entire web page. note that proper CSS scoping is not yet implemented (and can't be implemented hygienically since [$scoped] was removed 😢) +* flag [`html:snippet] produces a snippet of html instead of an entire web page. 
note that proper CSS scoping is not yet implemented (and can't be implemented hygienically since [`scoped] was removed 😢) * string [`html:title] specifies the webpage titlebar contents (normally autodetected from the document based on headings or directives) * string [`html:font] specifies the default font to use when rendering as a CSS font specification (e.g. [`-m html:font 'Alegreya, Junicode, Georgia, "Times New Roman"]) ~~~ $ cortav readme.ct --out readme.html \ @@ -1020,8 +1024,8 @@ sorcrep: https://c.hale.su/sorcery ### intent files there's currently no standard way to describe the intent and desired formatting of a document besides placing pragmata in the source file itself. this is extremely suboptimal, as when generating collections of documents, it's ideal to be able to keep all formatting information in one place. users should also be able to specify their own styling overrides that describe the way they prefer to read [`cortav] files, especially for uses like gemini or gopher integration. -at some point soon [`cortav] needs to address this by adding intent files that can be activated from outside the source file, such as with a command line flag or a configuration file setting. these will probably consist of lines that are interpreted as pragmata. in addition to the standard intent format however, individual implementations should feel free to provide their own ways to provide intent metadata; e.g. the reference implementation, which has a lua interpreter available, should be able to take a lua script that runs after the parse stage and generates . this will be particularly useful for the end-user who wishes to specify a particular format she likes reading her files in without forcing that format on everyone she sends the compiled document to, as it will be able to interrogate the document and make intelligent decisions about what pragmata to apply. 
+at some point soon [`cortav] needs to address this by adding intent files that can be activated from outside the source file, such as with a command line flag or a configuration file setting. these will probably consist of lines that are interpreted as pragmata. in addition to the standard intent format however, individual implementations should feel free to provide their own ways to provide intent metadata; e.g. the reference implementation, which has a lua interpreter available, should be able to take a lua script that runs after the parse stage and makes arbitrary alterations to the AST. this will be particularly useful for the end-user who wishes to specify a particular format she likes reading her files in without forcing that format on everyone she sends the compiled document to, as it will be able to interrogate the document and make intelligent decisions about what pragmata to apply. intent files should also be able to define [>rsrc resources], [>ctxvar context variables], and macros. Index: cortav.lua ================================================================== --- cortav.lua +++ cortav.lua @@ -115,10 +115,17 @@ insert = function(self, block) block.origin = self:clone() table.insert(self.sec.blocks,block) return block end; + init = function(ctx, doc, src) + ctx.line = 0 + ctx.doc = doc + ctx.doc.src = src + ctx.sec = doc:mksec() -- toplevel section + ctx.sec.origin = ctx:clone() + end; ref = function(self,id) if not id:find'%.' then local rid = self.sec.refs[id] if self.sec.refs[id] then return self.sec.refs[id], id, self.sec @@ -186,10 +193,16 @@ end; context_var = function(self, var, ctx, test) local fail = function(...) if test then return false end ctx:fail(...) 
+ end + local scanParents = function(k) + for _,p in pairs(self.parents) do + local v = p:context_var(k, ctx, true) + if v ~= false then return v end + end end if startswith(var, 'cortav.') then local v = var:sub(8) if v == 'page' then if ctx.page then return tostring(ctx.page) @@ -219,27 +232,54 @@ return fail('undefined environment variable %s', v) end elseif self.stage.kind == 'render' and startswith(var, self.stage.format..'.') then -- TODO query the renderer somehow return fail('renderer %s does not implement variable %s', self.stage.format, var) + elseif startswith(var, 'super.') then + local sp = scanParents(var:sub(7)) + if sp == nil then + if test then return false else return '' end + else + return sp + end elseif self.vars[var] then return self.vars[var] else + local sp = scanParents(var) + if sp then return sp end if test then return false end return '' -- is this desirable behavior? end end; job = function(self, name, pred, ...) -- convenience func return self.docjob:fork(name, pred, ...) - end + end; + sub = function(self, ctx) + -- convenience function for single-inheritance structure + -- sets up a doc/ctx pair for a subdocument embedded in the source + -- of a greater document, pointing subdoc props to global tables/values + local newdoc = ct.doc.mk(self) + newdoc.meta = self.meta + newdoc.ext = self.ext + newdoc.enc = self.enc + newdoc.stage = self.stage + -- vars are handled through proper recursion across all parents and + -- are intentionally excluded here; subdocs can have their own vars + -- without losing access to parent vars + local nctx = ctx:clone() + nctx:init(newdoc, ctx.src) + nctx.line = ctx.line + return newdoc, nctx + end; }; - mk = function() return { + mk = function(...) 
return { sections = {}; secorder = {}; embed = {}; meta = {}; vars = {}; + parents = {...}; ext = { inhibit = {}; need = {}; use = {}; }; @@ -596,45 +636,47 @@ crit = crit; failthru = failthru; spans = spans; } end + end + local function rawcode(s, c) -- raw + local o = c:clone(); + local str = '' + for c, p in ss.str.each(c.doc.enc, s) do + local q = p:esc() + if q then + str = str .. q + p.next.byte = p.next.byte + #q + else + str = str .. c + end + end + return { + kind = 'format'; + style = 'literal'; + spans = {{ + kind = 'raw'; + spans = {str}; + origin = o; + }}; + origin = o; + } end ct.spanctls = { {seq = '!', parse = formatter 'emph'}; {seq = '*', parse = formatter 'strong'}; {seq = '~', parse = formatter 'strike'}; {seq = '+', parse = formatter 'insert'}; + {seq = '`\\', parse = rawcode}; + {seq = '\\\\', parse = rawcode}; {seq = '\\', parse = function(s, c) -- raw return { kind = 'raw'; spans = {s}; origin = c:clone(); } - end}; - {seq = '`\\', parse = function(s, c) -- raw - local o = c:clone(); - local str = '' - for c, p in ss.str.each(c.doc.enc, s) do - local q = p:esc() - if q then - str = str .. q - p.next.byte = p.next.byte + #q - else - str = str .. 
c - end - end - return { - kind = 'format'; - style = 'literal'; - spans = {{ - kind = 'raw'; - spans = {str}; - origin = o; - }}; - origin = o; - } end}; {seq = '`', parse = formatter 'literal'}; {seq = '$', parse = formatter 'variable'}; {seq = '^', parse = function(s,c) --footnotes local r, t = s:match '^([^%s]+)%s*(.-)$' @@ -986,11 +1028,12 @@ table.insert(last.lines, sp) j:hook('block_aside_attach', c, last, sp, l) j:hook('block_aside_line_insert', c, last, sp, l) end end}; - {pred = function(s,c) return s:match'^[*:]' end, fn = blockwrap(function(l,c) -- list + {pred = function(s,c) return s:match'^[*:]' end, + fn = blockwrap(function(l,c) -- list local stars = l:match '^([*:]+)' local depth = utf8.len(stars) local id, txt = l:sub(#stars+1):match '^(.-)%s*(.-)$' local ordered = stars:sub(#stars) == ':' if id == '' then id = nil end @@ -1075,10 +1118,31 @@ end elseif crit == '!' then c:fail('critical directive %s not supported',cmd) end end;}; + {pred = function(s) return s:match '^(>+)([^%s]*)%s*(.*)$' end, + fn = function(l,c,j,d) + local lvl,id,txt = l:match '^(>+)([^%s]*)%s*(.*)$' + lvl = utf8.len(lvl) + local last = d[#d] + local node + local ctx + if last and last.kind == 'quote' and (id == nil or id == '' or id == last.id) then + node = last + ctx = node.ctx + ctx.line = c.line -- is this enough?? 
+ else + local doc + doc, ctx = c.doc:sub(c) + node = { kind = 'quote', doc = doc, ctx = ctx, id = id } + j:hook('block_insert', c, node, l) + table.insert(d, node) + end + + ct.parse_line(txt, ctx, ctx.sec.blocks) + end}; {seq = '~~~', fn = blockwrap(function(l,c,j) local extract = function(ptn, str) local start, stop = str:find(ptn) if not start then return nil, str end local ex = str:sub(start,stop) @@ -1169,11 +1233,12 @@ else newline = {l} end table.insert(ctx.mode.listing.lines, newline) job:hook('block_listing_newline',ctx,ctx.mode.listing,newline) end - else + elseif ctx.mode.kind == 'quote' then + else local mf = job:proc('modes', ctx.mode.kind) if not mf then ctx:fail('unimplemented syntax mode %s', ctx.mode.kind) end mf(job, ctx, l, dest) --NOTE: you are responsible for triggering the appropriate hooks if you insert anything! @@ -1192,11 +1257,11 @@ end if not tryseqs(ct.ctlseqs) then local found = false - for eb, ext, state in job:each('blocks') do + for eb, ext, state in job:each 'blocks' do if tryseqs(eb, state) then found = true break end end if not found then ctx:fail 'incomprehensible input line' @@ -1216,15 +1281,11 @@ function ct.parse(file, src, mode, setup) -- this object is threaded down through the parse tree -- and copied to store information like the origin of the -- element in the source code local ctx = ct.ctx.mk(src) - ctx.line = 0 - ctx.doc = ct.doc.mk() - ctx.doc.src = src - ctx.sec = ctx.doc:mksec() -- toplevel section - ctx.sec.origin = ctx:clone() + ctx:init(ct.doc.mk(), src) ctx.lang = mode['meta:lang'] if mode['parse:enc'] then local e = ss.str.enc[mode['parse:enc']] if not e then ct.exns.enc('requested encoding not supported',mode['parse:enc']):throw() Index: desk/cortav.xml ================================================================== --- desk/cortav.xml +++ desk/cortav.xml @@ -111,10 +111,11 @@ + @@ -131,10 +132,11 @@ + Index: ext/transmogrify.lua ================================================================== --- 
ext/transmogrify.lua +++ ext/transmogrify.lua @@ -39,16 +39,17 @@ }; } local quotes = { [ss.str.enc.utf8] = { - ['en'] = {'“', '”'; '‘', '’'}; - ['de'] = {'„', '“'; '‚', '‘'}; - ['sp'] = {'«', '»'; '‹', '›'}; - ['ja'] = {'「', '」'; '『', '』'}; - ['fr'] = {'« ', ' »'; '‹ ', ' ›'}; - [true] = {'“', '”'; '‘', '’'}; + -- 5 = elision char + ['en'] = {'“', '”'; '‘', '’'; '’'}; + ['de'] = {'„', '“'; '‚', '‘'; '’'}; + ['sp'] = {'«', '»'; '‹', '›'; "’"}; + ['ja'] = {'「', '」'; '『', '』'; "'"}; + ['fr'] = {'« ',' »'; '‹ ',' ›'; "’"}; + [true] = {'“', '”'; '‘', '’'; '’'}; }; } local function meddle(ctx, t) local pts = patterns[ctx.doc.enc] @@ -141,11 +142,11 @@ default = true; -- on unless inhibited slow = true; hook = { doc_meddle_ast = function(job) for n, sec in pairs(job.doc.secorder) do - if sec.kind=='ordinary' or sec.kind=='blockquote' + if sec.kind=='ordinary' or sec.kind=='quote' or sec.kind=='footnote' then for i, block in pairs(sec.blocks) do if type(block.spans) == 'table' then enterspan(block.origin, block.spans) elseif type(block.spans) == 'string' then Index: render/html.lua ================================================================== --- render/html.lua +++ render/html.lua @@ -15,10 +15,21 @@ local f = string.format local getSafeID = ct.tool.namespace() local footnotes = {} local footnotecount = 0 + + local cdata = function(...) return ... 
end + if opts.epub then + opts.xhtml = true + end + + if opts.xhtml then + cdata = function(s) + return '' + end + end local langsused = {} local langpairs = { lua = { color = 0x9377ff }; terra = { color = 0xff77c8 }; @@ -46,21 +57,21 @@ } ]]; list_ordered = [[]]; list_unordered = [[]]; footnote = [[ - div.footnote { + aside.footnote { font-family: 90%; grid-template-columns: 1em 1fr min-content; grid-template-rows: 1fr min-content; position: fixed; padding: 1em; background: @tone(0.03); margin:auto; } @media screen { - div.footnote { + aside.footnote { display: grid; left: 10em; right: 10em; max-width: calc(@width + 2em); max-height: 30vw; @@ -68,11 +79,11 @@ border: 1px solid black; transform: translateY(200%); transition: 0.4s; z-index: 100; } - div.footnote:target { + aside.footnote:target { transform: translateY(0%); } #cover { position: fixed; top: 0; @@ -84,27 +95,27 @@ opacity: 0%; transition: 1s; pointer-events: none; backdrop-filter: blur(0px); } - div.footnote:target ~ #cover { + aside.footnote:target ~ #cover { opacity: 100%; pointer-events: all; backdrop-filter: blur(5px); } } @media print { - div.footnote { + aside.footnote { display: grid; position: relative; } - div.footnote:first-of-type { + aside.footnote:first-of-type { border-top: 1px solid black; } } - div.footnote > a[href="#0"]{ + aside.footnote > a[href="#0"]{ grid-row: 2/3; grid-column: 3/4; display: block; text-align: center; padding: 0 0.3em; @@ -118,35 +129,35 @@ -ms-user-select: none; user-select: none; -webkit-user-drag: none; user-drag: none; } - div.footnote > a[href="#0"]:hover { + aside.footnote > a[href="#0"]:hover { background: @tone(0.3); color: @tone(2); } - div.footnote > a[href="#0"]:active { + aside.footnote > a[href="#0"]:active { background: @tone(0.05); color: @tone(0.4); } @media print { - div.footnote > a[href="#0"]{ + aside.footnote > a[href="#0"]{ display:none; } } - div.footnote > div.number { + aside.footnote > div.number { text-align:right; grid-row: 1/2; 
             grid-column: 1/2;
         }
-        div.footnote > div.text {
+        aside.footnote > div.text {
             grid-row: 1/2;
             grid-column: 2/4;
             padding-left: 1em;
             overflow-y: auto;
         }
-        div.footnote > div.text > p:first-child {
+        aside.footnote > div.text > p:first-child {
             margin-top: 0;
         }
     ]];
     header = [[
         body { padding: 0 2.5em !important }
@@ -163,11 +174,10 @@
             margin-bottom: 0;
         }
         :is(h1,h2,h3,h4,h5,h6) + p {
             margin-top: 0.4em;
         }
-
     ]];
     headingAnchors = [[
         :is(h1,h2,h3,h4,h5,h6) > a[href].anchor {
             text-decoration: none;
             font-size: 1.2em;
@@ -253,10 +263,13 @@
             margin: 0;
             margin-top: 0.6em;
         }
         section > aside p:first-child {
             margin: 0;
+        }
+        section aside + aside {
+            margin-top: 0.5em;
         }
     ]];
     code = [[
         code {
             display: inline-block;
@@ -438,19 +451,27 @@
     local runhook = function(h, ...)
         return renderJob:hook(h, render_state_handle, ...)
     end
     local tagproc do
-        local elt = function(t,attrs)
-            return f('<%s%s>', t,
-                attrs and ss.reduce(function(a,b) return a..b end, '',
+        local html_open = function(t,attrs)
+            if attrs then
+                return t .. ss.reduce(function(a,b) return a..b end, '',
                     ss.map(function(v,k)
                         if v == true then
                             return ' '..k
                         elseif v then
                             return f(' %s="%s"', k, v)
                         end
-                    end, attrs)) or '')
+                    end, attrs))
+            else return t end
+        end
+
+        local elt = function(t,attrs)
+            if opts.xhtml then
+                return f('<%s />', html_open(t,attrs))
+            end
+            return f('<%s>', html_open(t,attrs))
         end
         tagproc = {
             toTXT = {
                 tag = function(t,a,v) return v end;
@@ -470,23 +491,33 @@
                 catenate = function(...) return ... end;
             };
             toHTML = {
                 elt = elt;
                 tag = function(t,attrs,body)
-                    return f('%s%s</%s>', elt(t,attrs), body, t)
+                    return f('<%s>%s</%s>', html_open(t,attrs), body, t)
                 end;
                 catenate = table.concat;
             };
         }
     end
     local function getBaseRenderers(procs, span_renderers)
         local tag, elt, catenate = procs.tag, procs.elt, procs.catenate
         local htmlDoc = function(title, head, body)
-            return [[]] .. tag('html',nil,
+            local attrs
+            local header = [[]]
+            if opts['epub'] then
+                -- so cursed
+                attrs = {
+                    xmlns = "http://www.w3.org/1999/xhtml";
+                    ['xmlns:epub'] = "http://www.idpf.org/2007/ops";
+                }
+                header = [[]]
+            end
+            return header .. tag('html',attrs,
                 tag('head', nil,
-                    elt('meta',{charset = 'utf-8'}) ..
+                    (opts.epub and '' or elt('meta',{charset = 'utf-8'})) ..
                     (title and tag('title', nil, title) or '') ..
                     (head or '')) ..
                 tag('body', nil, body or ''))
         end
@@ -647,11 +678,16 @@
             elseif d.failthru then
                 return htmlSpan(d.spans, b, s)
             end
         end
         function span_renderers.footnote(f,b,s)
-            addStyle 'footnote'
+            local linkattr = {}
+            if opts.epub then
+                linkattr['epub:type'] = 'noteref'
+            else
+                addStyle 'footnote'
+            end
             local source, sid, ssec = b.origin:ref(f.ref)
             local cnc = getSafeID(ssec) .. ' ' .. sid
             local fn
             if footnotes[cnc] then
                 fn = footnotes[cnc]
@@ -659,16 +695,19 @@
                 footnotecount = footnotecount + 1
                 fn = {num = footnotecount, origin = b.origin, fnid=cnc, source = source}
                 fn.id = getSafeID(fn)
                 footnotes[cnc] = fn
             end
-            return tag('a', {href='#'..fn.id}, htmlSpan(f.spans) ..
+            linkattr.href = '#'..fn.id
+            return tag('a', linkattr, htmlSpan(f.spans) ..
                 tag('sup',nil, fn.num))
         end
         return span_renderers
     end
+
+    local astproc
     local function getBlockRenderers(procs, sr)
         local tag, elt, catenate = procs.tag, procs.elt, procs.catenate
         local null = function() return catenate{} end
@@ -748,10 +787,35 @@
             ['break'] = function() -- HACK
                 -- lists need to be rewritten to work like asides
                 return '';
             end;
         }
+
+        function block_renderers.quote(b,s)
+            local ir = {}
+            local toIR = block_renderers
+            for i, sec in ipairs(b.doc.secorder) do
+                local secnodes = {}
+                for i, bl in ipairs(sec.blocks) do
+                    if toIR[bl.kind] then
+                        table.insert(secnodes, toIR[bl.kind](bl,sec))
+                    end
+                end
+                if next(secnodes) then
+                    if b.doc.secorder[2] then --#secs>1?
+                        -- only wrap in a section if >1 section
+                        table.insert(ir, tag('section',
+                            {id = getSafeID(sec)},
+                            secnodes))
+                    else
+                        ir = secnodes
+                    end
+                end
+            end
+            return tag('blockquote', b.id and {id=getSafeID(b)} or {}, catenate(ir))
+        end
+
         return block_renderers;
     end
     local function getRenderers(procs)
         local span_renderers = getSpanRenderers(procs)
@@ -758,11 +822,11 @@
         local r = getBaseRenderers(procs,span_renderers)
         r.block_renderers = getBlockRenderers(procs, r)
         return r
     end
-    local astproc = {
+    astproc = {
         toHTML = getRenderers(tagproc.toHTML);
         toTXT = getRenderers(tagproc.toTXT);
         toIR = { };
     }
     astproc.toIR.span_renderers = ss.clone(astproc.toHTML);
@@ -778,19 +842,18 @@
     -- bind to legacy names
     -- yikes this needs to be cleaned up so badly
     local ir = {}
     local dr = astproc.toHTML -- default renderers
     local plainr = astproc.toTXT
-    local irBlockRdrs = astproc.toIR.block_renderers;
     render_state_handle.ir = ir;
     local function renderBlocks(blocks, irs)
         for i, block in ipairs(blocks) do
             local rd
-            if irBlockRdrs[block.kind] then
-                rd = irBlockRdrs[block.kind](block,sec)
+            if astproc.toIR.block_renderers[block.kind] then
+                rd = astproc.toIR.block_renderers[block.kind](block,sec)
             else
                 local rdr = renderJob:proc('render',block.kind,'html')
                 if rdr then
                     rd = rdr({
                         state = render_state_handle;
@@ -816,10 +879,11 @@
                 table.insert(irs.nodes, rd)
                 runhook('ir_section_node_insert', rd, irs, sec)
             end
         end
     end
+    runhook('ir_assemble', ir)
     for i, sec in ipairs(doc.secorder) do
         if doctitle == nil and sec.depth == 1 and sec.heading_node then
             doctitle = astproc.toTXT.htmlSpan(sec.heading_node.spans, sec.heading_node, sec)
         end
@@ -828,33 +892,65 @@
             if #(sec.blocks) > 0 then
                 irs = {tag='section',attrs={id = getSafeID(sec)},nodes={}}
                 runhook('ir_section_build', irs, sec)
                 renderBlocks(sec.blocks, irs)
             end
-        elseif sec.kind == 'blockquote' then
+        elseif sec.kind == 'quote' then
         elseif sec.kind == 'listing' then
         elseif sec.kind == 'embed' then
         end
         if irs then table.insert(ir, irs) end
     end
-    for _, fn in pairs(footnotes) do
-        local tag = tagproc.toIR.tag
-        local body = {nodes={}}
-        local ftir = {}
-        for l in fn.source:gmatch('([^\n]*)') do
-            ct.parse_line(l, fn.origin, ftir)
+    do local fnsorted = {}
+        for _, fn in pairs(footnotes) do
+            fnsorted[fn.num] = fn
+        end
+
+        for _, fn in ipairs(fnsorted) do
+            local tag = tagproc.toIR.tag
+            local body = {nodes={}}
+            local ftir = {}
+            for l in fn.source:gmatch('([^\n]*)') do
+                ct.parse_line(l, fn.origin, ftir)
+            end
+            renderBlocks(ftir,body)
+            local fattr = {id=fn.id}
+            if opts.epub then
+                ---UUUUUUGHHH
+                local npfx = string.format('(%u) ', fn.num)
+                if next(body.nodes) then
+                    local n = body.nodes[1]
+                    repeat
+                        if n.nodes[1] then
+                            if type(n.nodes[1]) == 'string' then
+                                n.nodes[1] = npfx .. n.nodes[1]
+                                break
+                            end
+                            n = n.nodes[1]
+                        else
+                            n.nodes[1] = {tag='p',attrs={},nodes={npfx}}
+                            break
+                        end
+                    until false
+
+                else
+                    body.nodes[1] = {tag='p',attrs={},nodes={npfx}}
+                end
+                fattr['epub:type'] = 'footnote'
+            else
+                fattr.class = 'footnote'
+            end
+            local note = tag('aside', fattr, opts.epub and body.nodes or {
+                tag('div',{class='number'}, tostring(fn.num)),
+                tag('div',{class='text'}, body.nodes),
+                tag('a',{href='#0'},'⤫')
+            })
+            table.insert(ir, note)
         end
-        renderBlocks(ftir,body)
-        local note = tag('div',{class='footnote',id=fn.id}, {
-            tag('div',{class='number'}, tostring(fn.num)),
-            tag('div',{class='text'}, body.nodes),
-            tag('a',{href='#0'},'⤫')
-        })
-        table.insert(ir, note)
     end
-    if next(footnotes) then
+    if next(footnotes) and not opts.epub then
         table.insert(ir, tagproc.toIR.tag('div',{id='cover'},''))
     end
     -- restructure passes
     runhook('ir_restructure_pre', ir)
@@ -1014,11 +1110,11 @@
         table.insert(styles, string.format([[body {padding:0 1em;margin:auto;max-width:%s}]], opts.width))
     end
     if opts.accent then
         table.insert(styles, string.format(':root {--accent:%s}', opts.accent))
     end
-    if opts.accent or (not opts['dark-on-light']) and (not opts['fossil-uv']) then
+    if not opts.epub and (opts.accent or (not opts['dark-on-light']) and (not opts['fossil-uv'])) then
         addStyle 'accent'
     end
     for _,k in pairs(stylesNeeded.order) do
@@ -1033,11 +1129,11 @@
         if type(css) ~= 'string' then ct.exns.mode('must be a string', 'html:link-css'):throw() end
         styletag = styletag .. tagproc.toHTML.elt('link',{rel='stylesheet',type='text/css',href=opts['link-css']})
     end
     if next(styles) then
         if opts['gen-styles'] then
-            styletag = styletag .. tagproc.toHTML.tag('style',{type='text/css'},table.concat(styles))
+            styletag = styletag .. tagproc.toHTML.tag('style',{type='text/css'},cdata(table.concat(styles)))
         end
         table.insert(head, styletag)
     end
     if opts['fossil-uv'] then