File parsav.md artifact eb5d145ae6 part of check-in 5b3a03ad34

parsav

parsav is a lightweight fediverse server

backends

parsav is designed to be storage-agnostic, and can draw data from multiple backends at a time. backends can be enabled or disabled at compile time to avoid unnecessary dependencies.

postgresql

dependencies

mongoose
json-c
mbedtls
postgresql backend:
- postgresql-libs

additional build-time dependencies are necessary if you are building directly from trunk, rather than from a release tarball that includes certain build artifacts which need to be embedded in the binary:

inkscape, for rendering out UI graphics
cwebp (libwebp package), for transforming inkscape PNGs to webp
sassc, for compiling the SCSS stylesheet into its final CSS

all builds require terra, which, unfortunately, requires installing an older version of llvm, v9 at the latest (which i develop parsav under). with any luck, your distro will be clever enough to package terra and its dependencies properly (it's trivial on nix, tho you'll need to tweak the terra expression to select a more recent llvm package); Arch Linux is one of those distros which is not so clever, and whose (AUR) terra package is totally broken. due to these unfortunate circumstances, parsav is distributed not just in source form, but also in the form of LLVM IR. distributions will also be made in the form of tarballed object code and assembly listings for various common platforms, currently including x86-32/64, arm7hf, aarch64, riscv, mips32/64, and ppc64/64le.

i've noticed that terra (at least with llvm9) seems to get a bit cantankerous and trigger llvm to fail with bizarre errors when you try to cross-compile parsav from x86-64 to any other platform, even x86-32. i don't know if this problem exists on other architectures or in what form, but as a workaround, the current cross-compile process consists of generating LLVM IR (ostensibly for x86-64, though this is in reality an architecture-independent language), and then compiling that down to an object file with llc. this is an enormous hassle; hopefully the terra people will fix this eventually.

also note that, while parsav has a flag to build with ASAN, ASAN has proven unusable for most purposes as it routinely reports false-positive heap-buffer-overflows. if you figure out how to defuckulate this, i will be overjoyed.

building

first, either install any missing dependencies as shared libraries, or build them as static libraries with the command make dep.$LIBRARY. as a shortcut, make dep will build all dependencies as static libraries. note that if the build system finds a static version of a library in the lib/ folder, it will use that instead of any system library. these commands require GNU make (it may be installed as gmake on your system), although this is a fairly soft dependency -- if you really need to build it on BSD make, you can probably translate the makefile with a minute or so of work; you'll just have to do some of the various gmake functions' work manually. this may be worthwhile if you're packaging for a BSD.

postgresql-libs must be installed systemwide, as parsav does not currently provide for statically compiling and linking it

configuring

the parsav configuration consists of two components: the backend list and the config store. the backend list is a simple text file that tells parsav which data sources to draw from. the config store is a key-value store which contains the rest of the server's configuration, and is loaded from the backends. the config store can be spread across the backends; backends will be checked for configuration keys according to the order in which they are listed. changes to the config store affect parsav in real time; you only need to restart the server if you make a change to the backend list.
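as a rough illustration of the ordered lookup described above, here is a minimal python sketch. it is not parsav's implementation: the lookup function, the dict-based backends, and the key names are all hypothetical, and "first backend that defines the key wins" is an assumption about how the ordering resolves conflicts.

```python
from typing import Mapping, Optional, Sequence


def lookup(key: str, backends: Sequence[Mapping[str, str]]) -> Optional[str]:
    """Return the value from the first backend, in listed order, that defines key."""
    for store in backends:
        if key in store:
            return store[key]
    return None  # no backend defines this key


# hypothetical backends and keys, purely for illustration
master = {"domain": "example.org"}
tweets = {"domain": "ignored.example", "motd": "hello"}
backends = [master, tweets]  # same order as the backend list

print(lookup("domain", backends))
print(lookup("motd", backends))
```

under that assumption, a key defined in several backends takes its value from whichever backend is listed first, so the order of the backend list acts as a priority order.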

eventually, we'll add a command-line tool parsav-cfg to enable easy modification of the configuration store from the command line; for now, you'll need to modify the database by hand or use the online administration menu. the schema.sql file contains commands to prompt for various important values like the name of your administrative user.

by default, parsav looks for a file called backend.conf in the current directory when it is launched. you can override this default with the parsav_backend_file environment variable or with the -b/--backend-file flag. backend.conf lists one backend per line, in the form id type confstring. for instance, if you had two postgresql databases, you might write a backend file like

master   pgsql   host=localhost dbname=parsav
tweets   pgsql   host=420.69.dread.cloud dbname=content

the form the configuration string takes depends on the specific backend.
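to make the id type confstring layout concrete, here is a small python sketch that splits such a file into its three fields. this is not parsav's own parser; the Backend type and parse_backends function are hypothetical, and skipping blank lines and allowing an empty confstring are assumptions.

```python
from typing import List, NamedTuple


class Backend(NamedTuple):
    backend_id: str
    backend_type: str
    confstring: str


def parse_backends(text: str) -> List[Backend]:
    """Split each line into id, type, and the rest of the line as confstring."""
    backends = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        # split on whitespace at most twice, so the confstring keeps its spaces
        parts = line.split(None, 2)
        if len(parts) < 2:
            raise ValueError(f"line {lineno}: expected 'id type confstring'")
        conf = parts[2] if len(parts) > 2 else ""
        backends.append(Backend(parts[0], parts[1], conf))
    return backends


example = """\
master   pgsql   host=localhost dbname=parsav
tweets   pgsql   host=420.69.dread.cloud dbname=content
"""

for b in parse_backends(example):
    print(b.backend_id, b.backend_type, repr(b.confstring))
```

the confstring is passed through untouched, since only the backend itself knows how to interpret it.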

postgresql backend

currently, postgres needs to be configured manually before parsav can make use of it to store data. the first step is to create a database for parsav's use. once you've done that, you need to create the database schema with the command $ psql (-h $host) -d $database -f schema.sql. you'll be prompted for some crucial settings to install in the configuration store, such as the name of the relation you want to use for authentication (we'll call it parsav_auth from here on out).

parsav separates the storage of user credentials from the storage of other user data, in order to facilitate centralized user accounting. you don't need to take advantage of this feature, and if you don't want to, you can just create a parsav_auth table and have done. however, parsav_auth can also be a view, collecting a list of authorized users and their various credentials from whatever source you please.

parsav_auth has the following schema:

create table parsav_auth (
	aid bigint primary key,
	uid bigint,
	newname text,
	kind text not null,
	cred bytea not null,
	restrict text[],
	netmask cidr,
	blacklist bool
)

aid is a unique value identifying the authentication method. it must be deterministic -- values based on time of creation or a hash of uid+kind+cred are ideal. uid is the identifier of the user the row specifies credentials for. kind is a string indicating the credential type, and cred is the content of that credential. for the meaning of these fields and use of this structure, see authentication below.
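one way to get a deterministic aid from a hash of uid+kind+cred is sketched below in python. the derive_aid function is hypothetical, and the choice of SHA-256, the field encoding, and the truncation to a signed 64-bit integer (so the value fits a postgres bigint) are all assumptions rather than anything parsav prescribes.

```python
import hashlib
import struct


def derive_aid(uid: int, kind: str, cred: bytes) -> int:
    """Derive a stable signed 64-bit id from the fields of an auth row."""
    kind_bytes = kind.encode("utf-8")
    h = hashlib.sha256()
    h.update(struct.pack(">q", uid))
    # length-prefix kind so the kind/cred boundary is unambiguous
    h.update(struct.pack(">I", len(kind_bytes)))
    h.update(kind_bytes)
    h.update(cred)
    # the first 8 bytes, read as a signed integer, fit the bigint range
    return struct.unpack(">q", h.digest()[:8])[0]


aid = derive_aid(123, "pw-sha512", b"\x12\xbf\x90")
print(aid)
```

the same uid, kind, and cred always produce the same aid, so a view that collects credentials from an external source can compute it on the fly. truncating to 64 bits does leave a small chance of collision, which the primary key constraint would surface as an error.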

authentication

in the most basic case, an authentication record would be something like {uid = 123, kind = "pw-sha512", cred = "12bf90…a10e"}. but parsav is not restricted to username-password authentication, and in addition to various hashing styles, it will also support more esoteric forms of authentication. any individual user can have as many auth rows as she likes. there is also a restrict field, which is normally null, but can be specified in order to restrict a particular credential to certain operations, such as posting tweets or updating a bio. blacklist indicates that any attempt to authenticate that matches this row will be denied, regardless of whether it matches other rows. if netmask is present, this authentication will only succeed if it comes from the specified IP mask.

uid can also be 0 (not null, which matches any user!), indicating that there is not yet a record in parsav_actors for this account. if this is the case, newname must contain the handle of the account to be created when someone attempts to log in with this credential. whether newname is used in the authentication process depends on whether the authentication method accepts a username. all rows with the same uid must have the same newname.
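the interaction between netmask, blacklist, and a null uid can be sketched in python roughly as follows. this is an illustration under stated assumptions, not parsav's code: AuthRow, row_matches, and authenticate are hypothetical names, only the pw-sha512 and trust kinds are handled, cred is assumed to hold the raw unsalted SHA-512 digest of the password, and the restrict field and the uid = 0 account-creation case are left out entirely.

```python
import hashlib
import hmac
import ipaddress
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class AuthRow:
    uid: Optional[int]  # None stands in for SQL null, which matches any user
    kind: str
    cred: bytes
    netmask: Optional[str] = None
    blacklist: bool = False


def row_matches(row: AuthRow, uid: int, password: str, source_ip: str) -> bool:
    if row.uid is not None and row.uid != uid:
        return False
    if row.netmask is not None:
        # the row only applies to requests coming from inside the netmask
        if ipaddress.ip_address(source_ip) not in ipaddress.ip_network(row.netmask):
            return False
    if row.kind == "trust":
        return True
    if row.kind == "pw-sha512":
        digest = hashlib.sha512(password.encode("utf-8")).digest()
        return hmac.compare_digest(digest, row.cred)
    return False  # kinds this sketch does not model never match


def authenticate(rows: Iterable[AuthRow], uid: int, password: str, source_ip: str) -> bool:
    matched = [r for r in rows if row_matches(r, uid, password, source_ip)]
    # a single matching blacklist row denies the attempt, whatever else matched
    if any(r.blacklist for r in matched):
        return False
    return bool(matched)


pw = hashlib.sha512(b"hunter2").digest()
rows = [
    AuthRow(uid=123, kind="pw-sha512", cred=pw),
    AuthRow(uid=123, kind="pw-sha512", cred=pw, netmask="10.0.0.0/8", blacklist=True),
]
print(authenticate(rows, 123, "hunter2", "192.168.1.5"))
print(authenticate(rows, 123, "hunter2", "10.1.2.3"))
```

here the second row blocks an otherwise valid password whenever the request comes from 10.0.0.0/8, which is the kind of rule the blacklist flag makes possible.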

below is a full list of authentication types we intend to support. a checked box indicates the scheme has been implemented.

☑ pw-sha{512,384,256,224}: an ordinary password, hashed with the appropriate algorithm
☐ pw-{sha1,md5,clear} (insecure, must be manually enabled at compile time with the config variable parsav_let_me_be_a_dumbass="i know what i'm doing")
☐ pw-pbkdf2-hmac-sha{…}: a password hashed with the Password-Based Key Derivation Function 2 instead of plain SHA2
☐ api-digest-sha{…}: a value that can be hashed with the current epoch to derive a temporary access key without logging in. these are used for API calls, sent in the header X-API-Key.
☐ otp-time-sha1: a TOTP PSK: the first two bytes represent the step, the third byte the OTP length, and the remaining ten bytes the secret key
☐ tls-cert-fp: a fingerprint of a client certificate
☐ tls-cert-ca: a value of the form fp/key=value where a client certificate with the property key=value (e.g. uid=cyberlord19) signed by a certificate authority matching the given fingerprint fp can authenticate the user
☐ challenge-rsa-sha256: an RSA public key. the user is presented with a challenge and must sign it with the corresponding private key using SHA256.
☐ challenge-ecc-sha256: a Curve25519 public key. the user is presented with a challenge and must sign it with the corresponding private key using SHA256.
☐ challenge-ecc448-sha256: a Curve448 public key. the user is presented with a challenge and must sign it with the corresponding private key using SHA256.
☑ trust: authentication always succeeds. only use in combination with netmask!!!
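to illustrate the otp-time-sha1 credential layout from the list above (two bytes of step, one byte of OTP length, ten bytes of secret), here is a python sketch that unpacks such a credential and computes a code using the standard TOTP construction from RFC 6238. the scheme is not yet implemented in parsav, so everything here is speculative: parse_totp_cred and totp are hypothetical names, and the big-endian byte order of the step field is an assumption.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional, Tuple


def parse_totp_cred(cred: bytes) -> Tuple[int, int, bytes]:
    """Split a 13-byte credential into (step, digits, secret)."""
    if len(cred) != 13:
        raise ValueError("expected 2 + 1 + 10 bytes")
    step, digits = struct.unpack(">HB", cred[:3])  # assumption: big-endian step
    if step == 0 or digits == 0:
        raise ValueError("step and OTP length must be nonzero")
    return step, digits, cred[3:]


def totp(cred: bytes, now: Optional[float] = None) -> str:
    step, digits, secret = parse_totp_cred(cred)
    # the counter is the number of whole steps since the unix epoch
    counter = int(time.time() if now is None else now) // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # dynamic truncation, as in HOTP (RFC 4226)
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# hypothetical credential: 30-second step, 6 digits, a throwaway secret
cred = struct.pack(">HB", 30, 6) + b"0123456789"
print(totp(cred))
```

packing the step and length into the credential itself means each row carries its own TOTP parameters, so different users (or different devices for the same user) could use different settings.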

license

parsav is released under the terms of the EUPL v1.2. copies of this license are included in the repository. dependencies are produced

future direction

parsav needs more storage backends, as it currently supports only postgres. some possibilities, in order of priority, are:

plain text/filesystem storage
lmdb
sqlite3
generic odbc
lua
ldap?? possibly just for users
cdb (for static content, maybe?)
mariadb/mysql
the various nosql horrors, e.g. redis, mongo, and so on
