scml

abstract

scml is a template language that can be used to generate html based on Scheme and S-expressions. this repo contains an implementation written in Chicken Scheme.

dependencies

scml is relatively lightweight. it depends only on Chicken Scheme, its stdlib, and the two small libraries lib/fail.scm and lib/lisp-macro.scm, included with this package. compilation will not require installing anything beyond the chicken package.

building

scml is a library. you can include it in your own projects with the stanza (include "scml.scm"). however, the library also comes with a trivial interpreter scmlc that can be used to transform *.sm templates into HTML files. to build it, cd into the directory that contains makefile and execute one of the following commands

make scmlc    # creates release version of the `scmlc` binary 
make debug    # creates debug version of the `scmlc` binary

scml also comes with a program demonstrating how it may be embedded, called embed.scm. it can be compiled with make embed.

using

there are two ways to use scml: as a binary that takes *.sm templates and emits HTML files, or as a library inside a user-written program.

the bundled compiler scmlc reads from standard input and writes to standard output. we can run it on e.g. the bundled torture-test.sm file with the command $ cat torture-test.sm | ./scmlc > torture-test.html, or by using the makefile's generic %.html rule: make torture-test.html.

the library is described further on.

templates

a very simple template that just generates static HTML might look something like the following

; template.sm
((!doctype html))
(html (head (- meta (charset . "utf-8"))
            (title "goodbye world"))
      (body (p "this is paragraph no." 1)
            (p "this is a paragraph" (strong "with")
               ((span (style . "font-size: 110%")) "styled")
               (code "text"))))

this should be fairly straightforward to understand. there are three exceptions, all of which have to do with how scml handles attributes, which have no straightforward s-exp equivalent.

we'll start with the easier one. the form (- …) is used to create tags that do not have bodies. the <meta> tag is one such tag. attributes and values are supplied after the name of the tag consed together, e.g. (key . "value"). for instance, <input name="user" type="text"> translates to (- input (name . "user") (type . "text")).

note: the Chicken Scheme reader (the function that transforms text into s-expressions), along with many other Scheme readers, allows the use of brackets beyond mere parentheses. while it is less strictly portable, you can use your choice of brackets to make code more readable, perhaps setting off attribute lists with [ … ] and code blocks as {% … } {@ … } {= …}, or any other style that works best for you.

normal tags can take attributes too; in fact, the (- …) form is simply syntactic sugar for the full form. consider the HTML element <textarea name="desc">description</textarea> - we can express this in scml as ((textarea (name . "desc")) "description"). in other words, if the first term of a list is another list, the compiler interprets it as the tag followed by an attribute list.

"boolean" attributes can also be encoded this way. rather than using a cons pair, you can simply enter them into the attribute list as symbols. this enables us to write a <!doctype html> declaration using one of two constructs

((!doctype html)) ; no syntactic sugar
(- !doctype html) ; with syntactic sugar

embedding scheme

there are three special forms that allow us to embed scheme code that is evaluated when the scml is translated to html. the simplest form is (% …) which will execute arbitrary scheme code. this code can mutate the execution environment, but its output will be ignored. a good use for this form is to define functions or globals

(% (define page-title "index")
   (define path '(root list index))
   (define (emit-form target prompt)
          `((form (action . ,target) (method . "POST"))
            (div ,prompt "? " (- input (type . "text")
                                       (name . "field"))
                              (- input (type . "submit"))))))

the next form, (@ …), is designed to make it as easy as possible to use scheme functions from within the template body. simply write a normal scheme function call but with the atom @ preceding the function name to dump its result into the page. the following example uses the defined functions to succinctly generate multiple simple forms. (note that (% …) can be implemented in terms of (@ …) as (@ begin … '()).)

(- !doctype html)
(html (head (title "form example"))
      (body (p "here is a form")
            (@ emit-form "/cgi-bin/submit.pl"
               "to whom dost thou submit, peon")
            (p "and here is a second form")
            (@ emit-form "/post.php"
               "who shall suffer at the whipping post")))

note that when functions called with @ return lists, the compiler interprets them as scml and processes them before inserting the result into the HTML document. this process is fully recursive, so script nodes can return script nodes that return script nodes and they will all be evaluated. whether this behavior is useful, i have absolutely no clue, but it was easier to support it than not. strings that are returned are inserted as normal, and numbers are converted to strings. any other constructs (such as bare symbols) will trigger an "invalid node" error.

lastly, the form (= …) allows you to evaluate an arbitrary scheme expression and insert its result in the page.

(% (define page-title "index"))
(- !doctype html)
(html (head (title (= page-title)))
      (body (h1 "a demonstration of the power of Scheme!")
            (p "quake and tremble in mortal terror as you behold"
               "the awesome and insuperable arithmetic might of the"
               "programming language to end all programming"
               "languages! witness: for 2 + 2 =" (= (+ 2 2)))
            (p "have the petty kingdoms of Man e'er known such"
               "fearsome tidings? BEND THY KNEE IN WORSHIP")))

the ability to embed scripts exists mostly for the benefit of scmlc users or for processing runtime-defined content. if you are using scml.scm as a library in tooling of your own creation, you should use quasiquotation instead of script nodes and disable their evaluation unless you have a very good reason not to. in particular, if you are accepting "unsanitized" content from an untrusted source, you should disable script node evaluation. do not attempt to manually "sanitize" input under any circumstances.
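as a rough sketch of the quasiquotation approach: the function page below is hypothetical (it is not part of scml) and simply builds a structure out of ordinary scheme values. nothing in it relies on script nodes, so evaluation can stay disabled.

```scheme
;; hypothetical helper, not part of scml: builds an scml structure
;; (a list of nodes) from plain scheme data using quasiquotation.
(define (page title items)
  `((- !doctype html)
    (html (head (title ,title))
          (body (ul ,@(map (lambda (item) `(li ,item)) items))))))
```

the result could then be handed to (scml-compile), described in the embedding section, with permit-eval set to #f, e.g. (scml-compile k (page "index" '("one" "two")) #f), where k stands for whatever error continuation you are using.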

a note on concatenation: scml attempts to intelligently insert spaces between all nodes. therefore the node (div "this" "is a" (span "sentence")) will be translated to <div>this is a <span>sentence</span></div>. spaces are omitted if there is ASCII punctuation at the boundary; for instance, (div "a." "b") becomes <div>a.b</div>. if you encounter an edge case and need to ensure that spaces are not inserted, break out to Scheme and use (string-append) yourself.
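a minimal sketch of that escape hatch, assuming a made-up item-id variable: the pieces are joined in scheme first, so the (= …) node hands scml a single string and there is no boundary inside it where a space could be inserted.

```scheme
(% (define item-id 42)) ; hypothetical value, for illustration only
(p "order" (= (string-append "A" (number->string item-id))))
```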

TODO: respect non-ASCII punctuation

TODO: add a switch to turn this off for CJK langs or maybe turn off space insertion if unicode characters are detected at border, thereby killing two birds with one stone? merge requests welcome

finally, scml includes two shorthand mechanisms for generating html. firstly, since css classes are so commonly used and the html syntax for them so absurdly cumbersome, scml uses a more css-like notation - you can simply put a . immediately after the tag name (with no spaces) followed by the class of that tag. if the tag name is omitted, it is inferred to be a <div>.

(html (head (- link (rel  . "stylesheet")
                    (href . "style.css")))
      (body (p "this is a normal paragraph")
            (p.big "this is a big paragraph")
            (.small "this is a small div")
            (p.|big cursed| "this is a big cursed paragraph")
            (.|small cursed| "this is a small cursed div"
                (span.big "with a big span inside it!"))))

embedding

you can write HTML-generation binaries of your own that rely on scml.scm for code-generation, allowing you to instead focus on high-level document structure.

(include "scml.scm") ; chicken scheme modules are an underdocumented
                     ; disaster zone so we're using transclusion instead,
                     ; at least until some kind soul submits a merge
                     ; request to make the code properly modular.

transcluding scml.scm into your project imports two entrypoints: (scml-compile) and (scml-read-and-compile). the latter takes no arguments and reads an scml document from the current input port and returns a string containing compiled HTML. the former takes 3 arguments.

(scml-compile *catch-fail* ; a continuation function called on error
              structure ; the scml structure to evaluate (a list of html nodes)
              permit-eval) ; #t to enable scripting nodes, #f to disable them

you can handle errors in one of two ways. when (scml-compile) encounters an error, it will call the function passed in as *catch-fail*. in order to abort operation, this should be a continuation created using (call/cc). the exception-handling library scml uses provides a very simple way to do this, suitable for very simple implementations or debug code: the macro (try).

(try processed-html (scml-compile structure permit-eval)
    (display processed-html))

if the function completes without error, its return value will be assigned to processed-html and the following block of code will be executed. otherwise, an error will be printed to stdout and the block will be skipped.

if you need more complex behavior, you can create a continuation yourself. you can detect an error by checking if the return value is the pair ( #f . <error> ) where <error> is a string describing the error that took place.
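the do-it-yourself route might look roughly like the sketch below. the names compile-or-error and compile-failed? are hypothetical helpers, not part of scml, and the sketch assumes that on error the continuation is invoked with a single argument, the ( #f . <error> ) pair described above. the compile function is taken as a parameter so the pattern can be read (and tried) on its own.

```scheme
;; hypothetical helper: runs `compile` (e.g. scml-compile) with an escape
;; continuation as its first argument. returns the compiled html on
;; success, or whatever value was passed to the continuation on failure
;; (assumed here to be the pair (#f . "message")).
(define (compile-or-error compile structure permit-eval)
  (call-with-current-continuation
    (lambda (k) (compile k structure permit-eval))))

;; hypothetical helper: #t if `result` looks like an error pair.
(define (compile-failed? result)
  (and (pair? result) (eq? (car result) #f)))
```

with scml.scm included, you would call (compile-or-error scml-compile structure #f), test the result with (compile-failed? result), and read the message with (cdr result) before deciding how to report it.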

TODO: this exception mechanism is extremely primitive (read: stupid) and is slated for replacement as soon as either the author can properly wrap her head around continuations or someone else is kind enough to contribute the code.

license

the tools and example code in this repo are the exclusive property of alexis summer hale and are released under the Affero General Public License v3.