Functions & operators

Functions build expressions — the computed parts of a rule: filters, unifications with =, and the values you write back with :put. This page is the complete catalog. All functions except those that extract the current time and those having names starting with rand_ are deterministic.

Most of the catalog composes inside a single rule body. The query below turns raw rows from an agent's memory store into a readable digest: it scores each memory against a cue vector, blends in the stored importance, tags storage-related rows with a regex, and renders the unix timestamp as a date. (memory holds a text, an importance, a timestamp at, and a 4-dimensional embedding v; the same example relation appears throughout these docs.)

?[when, id, tag, score] := *memory{ id, text, importance, at, v },
    sim = 1 - cos_dist(v, vec([0.7, 0.1, 0.6, 0.1])),
    sim > 0.5,
    score = round(100 * (0.7 * sim + 0.3 * importance)) / 100,
    tag = if(regex_matches(text, '(?i)compaction|sst'), 'storage', 'other'),
    when = slice_string(format_timestamp(at), 0, 10)
:order -score

when         id    tag       score
2025-07-04   m4    storage   0.94
2025-07-05   m5    storage   0.93
2025-07-02   m2    other     0.89
2025-07-03   m3    storage   0.89
2025-07-08   m8    other     0.64

Non-functions

Functions take expressions as arguments, evaluate each argument in turn, and then evaluate their implementation to produce a value usable in an expression. Before the catalog, here are the constructs that look like functions but are not.

These are language constructs that return Horn clauses instead of expressions:

var = expr unifies expr with var. Different from expr1 == expr2.
not clause negates a Horn clause clause. Different from !expr or negate(expr).
clause1 or clause2 connects two Horn clauses by disjunction. Different from or(expr1, expr2).
clause1, clause2 connects two Horn clauses by conjunction. There is no clause-level and keyword: the comma is the conjunction, and and exists only as the expression function and(...) / the && operator.

or binds more tightly than ,, so a or b, c means (a or b), c:

?[x] := x = 1 or x = 2, x = 2

[2]

These are constructs that return expressions:

if(a, b, c) evaluates a, and if the result is true, evaluates b and returns its value, otherwise evaluates c and returns its value. a must evaluate to a boolean. Only the selected branch is evaluated.
if(a, b) same as if(a, b, null).
cond(a1, b1, a2, b2, ...) evaluates a1; if the result is true, it returns the value of b1, otherwise it continues with a2 and b2, and so on. An even number of arguments must be given, and every condition (a1, a2, ...) must evaluate to a boolean. If all conditions are false, null is returned. For a catch-all clause at the end, use true as its condition.

Operators representing functions

Some functions have equivalent operator forms, which are easier to type and perhaps more familiar. First the binary operators:

a && b is the same as and(a, b)
a || b is the same as or(a, b)
a ^ b is the same as pow(a, b)
a ++ b is the same as concat(a, b)
a + b is the same as add(a, b)
a - b is the same as sub(a, b)
a * b is the same as mul(a, b)
a / b is the same as div(a, b)
a % b is the same as mod(a, b)
a >= b is the same as ge(a, b)
a <= b is the same as le(a, b)
a > b is the same as gt(a, b)
a < b is the same as lt(a, b)
a == b is the same as eq(a, b)
a != b is the same as neq(a, b)
a ~ b is the same as coalesce(a, b)
a -> b is the same as maybe_get(a, b)

These operators have precedence as follows (the earlier rows bind more tightly, and within the same row operators have equal binding power):

->
~
^
*, /
+, -, ++
%
==, !=
>=, <=, >, <
&&
||

With the exception of ^, all binary operators are left associative: a / b / c is the same as (a / b) / c. ^ is right associative: a ^ b ^ c is the same as a ^ (b ^ c).

The unary operators are:

-a is the same as minus(a)
!a is the same as negate(a)

Function applications using parentheses bind the tightest, followed by the unary operators, then the binary operators — with one exception: -> binds more tightly than the unary operators, so -j->0 is -(j->0).

Caution

Two rows differ from C-family languages: % binds more loosely than + and -, and == / != bind more tightly than the order comparisons. Parenthesize when in doubt.

?[a, b, c] := a = 2 + 3 % 4, b = 2 ^ 3 ^ 2, c = -2 ^ 2

[1, 512.0, 4.0]

2 + 3 % 4 is (2 + 3) % 4; 2 ^ 3 ^ 2 is 2 ^ 9; and -2 ^ 2 is (-2) ^ 2, since unary minus binds more tightly than ^.
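For readers porting expressions from a host language, the grouping above can be checked by writing each expression in its function form, where no precedence rule applies. The sketch below does this in Python; the helper names add, mod, pow_ and minus are local stand-ins for the catalog functions, not a client API, and the final two assertions show that Python groups the same source text differently.

```python
# Function forms of the three expressions, with explicit grouping so that
# no host-language precedence rule is involved.
def add(x, y):
    return x + y

def mod(x, y):
    return x % y            # both operands are positive here, so sign rules do not matter

def pow_(x, y):
    return float(x) ** y    # pow always returns a float, per this page

def minus(x):
    return -x

a = mod(add(2, 3), 4)       # 2 + 3 % 4  groups as (2 + 3) % 4
b = pow_(2, pow_(3, 2))     # 2 ^ 3 ^ 2  groups as 2 ^ (3 ^ 2)
c = pow_(minus(2), 2)       # -2 ^ 2     groups as (-2) ^ 2

print(a, b, c)              # 1 512.0 4.0

# The same source text groups differently in Python itself:
assert 2 + 3 % 4 == 5       # % binds more tightly than + in Python
assert -2 ** 2 == -4        # ** binds more tightly than unary minus in Python
```

When an expression is copied between the two languages, adding the parentheses explicitly removes the ambiguity on both sides.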

Equality and comparisons

The filters of most queries end up here. The one rule to internalize: == compares values (numerically across Int and Float), while the four order comparisons require both sides to come from the same value family and raise an error otherwise.

`eq(x, y)`

Equality comparison. The operator form is x == y. Mixed Int/Float arguments are compared numerically, so 1 == 1.0 is true; if the two arguments are otherwise of different types, the result is false.

`neq(x, y)`

Inequality comparison. The operator form is x != y. Mixed Int/Float arguments are compared numerically; if the two arguments are otherwise of different types, the result is true.

?[x, y] := x = (1 == 1.0), y = ('one' == 1)

[true, false]

Caution

== compares numerically, but stored keys and joins compare structurally: a relation can hold both 1 and 1.0 as distinct keys, and a join will not unify them. See the key warning in Types.

`gt(x, y)`

Equivalent to x > y.

`ge(x, y)`

Equivalent to x >= y.

`lt(x, y)`

Equivalent to x < y.

`le(x, y)`

Equivalent to x <= y.

Note

The four comparison operators are only defined between two values from the same family — Null, Bool, Number, String, Bytes, or List. Comparing 1 < 'a' raises an error, and so does comparing two Uuid or two Vector values. Integers and floats are both Number. See Equality and ordering.

`max(x, ...)`

Returns the maximum of the arguments. Can only be applied to numbers.

`min(x, ...)`

Returns the minimum of the arguments. Can only be applied to numbers.

Boolean functions

For combining conditions inside an expression — as opposed to the clause connectives ,/or/not, which operate on Horn clauses.

`and(...)`

Variadic conjunction. With two arguments it is equivalent to x && y.

`or(...)`

Variadic disjunction. With two arguments it is equivalent to x || y.

`negate(x)`

Negation. Equivalent to !x.

`assert(x, ...)`

Returns true if x is true, otherwise raises an error containing all its arguments as the error message. Useful for turning a should-never-happen condition into a loud failure instead of silently dropped rows.

Mathematics

The arithmetic for computing scores in-query: decayed importance, log-scaled counts, blended rankings, all next to the data instead of in the host language.

?[a, b, c, d] := a = round(-0.5), b = floor(-0.5), c = mod(-7, 3), d = 2 ^ 10

[-1.0, -1.0, -1, 1024.0]

`add(...)`

Variadic addition. The binary version is the same as x + y.

`sub(x, y)`

Equivalent to x - y.

`mul(...)`

Variadic multiplication. The binary version is the same as x * y.

`div(x, y)`

Equivalent to x / y.

`minus(x)`

Equivalent to -x.

`pow(x, y)`

Raises x to the power of y. Equivalent to x ^ y. Always returns a floating-point number.

`sqrt(x)`

Returns the square root of x.

`mod(x, y)`

Returns the remainder when x is divided by y. Arguments can be floats. The returned value has the same sign as x. Equivalent to x % y. An integer divisor of zero raises an error.
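The sign rule matters when results are compared against a host language. As a minimal sketch, assuming the rule stated above (the result takes the sign of x), the hypothetical Python helper below reproduces mod(-7, 3) == -1, whereas Python's own % operator takes the sign of the divisor. Behaviour for a float divisor of zero is not described on this page and is not modelled here.

```python
import math

def mod_like_x(x, y):
    """Remainder whose sign follows the dividend x (truncated division)."""
    if isinstance(x, int) and isinstance(y, int):
        if y == 0:
            # Mirrors the documented error for an integer divisor of zero.
            raise ZeroDivisionError("integer divisor of zero")
        r = abs(x) % abs(y)
        return -r if x < 0 else r
    # C-style fmod: the result has the same sign as x.
    return math.fmod(x, y)

print(mod_like_x(-7, 3))    # -1
print(-7 % 3)               # 2 (Python's % follows the sign of the divisor)
```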

`abs(x)`

Returns the absolute value.

`signum(x)`

Returns 1, 0 or -1, whichever has the same sign as the argument, e.g. signum(to_float('NEG_INF')) == -1, signum(0.0) == 0, but signum(-0.0) == -1. The result is an Int even for float arguments; only NAN input returns the float NAN.

`floor(x)`

Returns the floor of x.

`ceil(x)`

Returns the ceiling of x.

`round(x)`

Returns the nearest integer to the argument (represented as Float if the argument itself is a Float). Rounds halfway cases away from zero. E.g. round(0.5) == 1.0, round(-0.5) == -1.0, round(1.4) == 1.0.
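Host languages do not all round the same way, so scores computed in-query and in application code can disagree on halfway cases. The sketch below is a hypothetical Python helper that follows the rule stated above (halfway cases go away from zero); Python's built-in round uses round-half-to-even instead. It is an illustration only and ignores floating-point edge cases just below .5.

```python
import math

def round_half_away(x: float) -> float:
    """Round to the nearest integer, sending halfway cases away from zero."""
    return math.copysign(math.floor(abs(x) + 0.5), x)

print(round_half_away(0.5))    # 1.0
print(round_half_away(-0.5))   # -1.0
print(round_half_away(1.4))    # 1.0
print(round(2.5))              # 2 (Python rounds halves to the even neighbour)
```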

`exp(x)`

Returns the exponential of the argument, natural base.

`exp2(x)`

Returns the exponential base 2 of the argument. Always returns a float.

`ln(x)`

Returns the natural logarithm.

`log2(x)`

Returns the logarithm base 2.

`log10(x)`

Returns the logarithm base 10.

`sin(x)`

The sine trigonometric function.

`cos(x)`

The cosine trigonometric function.

`tan(x)`

The tangent trigonometric function.

`asin(x)`

The inverse sine.

`acos(x)`

The inverse cosine.

`atan(x)`

The inverse tangent.

`atan2(x, y)`

The two-argument inverse tangent (atan2), which takes x and y as separate arguments.

`sinh(x)`

The hyperbolic sine.

`cosh(x)`

The hyperbolic cosine.

`tanh(x)`

The hyperbolic tangent.

`asinh(x)`

The inverse hyperbolic sine.

`acosh(x)`

The inverse hyperbolic cosine.

`atanh(x)`

The inverse hyperbolic tangent.

`deg_to_rad(x)`

Converts degrees to radians.

`rad_to_deg(x)`

Converts radians to degrees.

`haversine(a_lat, a_lon, b_lat, b_lon)`

Uses the haversine formula to compute the angle, in radians, between two points a and b on a sphere, given their latitudes and longitudes. The inputs are in radians. For map data you probably want the next function, since most maps measure angles in degrees rather than radians.

`haversine_deg_input(a_lat, a_lon, b_lat, b_lon)`

Same as the previous function, but the inputs are in degrees instead of radians. The return value is still in radians.

If you want the approximate distance measured on the surface of the earth instead of the angle between two points, multiply the result by the radius of the earth, which is about 6371 kilometres, 3959 miles, or 3440 nautical miles.

Note

The haversine formula, when applied to the surface of the earth, which is not a perfect sphere, can result in an error of less than one percent.
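To make the angle-to-distance step concrete, here is a minimal Python sketch of the standard haversine formula. It is written from the textbook definition rather than from the engine's source, and the function names simply mirror the two catalog entries; EARTH_RADIUS_KM is the approximate radius quoted above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # approximate mean radius of the earth

def haversine(a_lat, a_lon, b_lat, b_lon):
    """Central angle in radians between two points; inputs in radians."""
    h = (math.sin((b_lat - a_lat) / 2) ** 2
         + math.cos(a_lat) * math.cos(b_lat)
         * math.sin((b_lon - a_lon) / 2) ** 2)
    return 2 * math.asin(math.sqrt(h))

def haversine_deg_input(a_lat, a_lon, b_lat, b_lon):
    """Same angle (still in radians), but with inputs in degrees."""
    return haversine(*map(math.radians, (a_lat, a_lon, b_lat, b_lon)))

# A quarter of the equator: the angle is pi/2, about 10,007.5 km on the ground.
angle = haversine_deg_input(0.0, 0.0, 0.0, 90.0)
distance_km = angle * EARTH_RADIUS_KM
```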

Vector functions

For constructing embeddings and computing distances directly in expressions — handy for exact reranking of a small candidate set, or for one-off similarity checks without an index. For approximate search over a large relation, build an HNSW index instead (see Proximity searches).

The mathematical functions that operate on floats also take vectors as arguments, applying the operation element-wise.

?[id, dist] := *memory{ id, v },
    dist = round(1000 * cos_dist(v, vec([0.7, 0.1, 0.6, 0.1]))) / 1000
:order dist
:limit 3

id    dist
m4    0.003
m3    0.023
m5    0.029

`vec(l, type?)`

Takes a list of numbers and returns a vector. Defaults to 32-bit float vectors; pass 'F64' (or 'Double') as the second argument for 64-bit vectors, and 'F32' (or 'Float') is accepted for the default.

Also accepts an existing vector (converting between element types as requested) and a base64-encoded string of the raw element bytes in native byte order (little-endian on all supported platforms).
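On the client side, the base64 form can be produced by packing the elements as little-endian floats and encoding the bytes. The Python sketch below shows this for 32-bit elements; the helper names are hypothetical, and the use of the standard base64 alphabet with padding is an assumption, since this page does not say which variant vec expects.

```python
import base64
import struct

def f32_vector_to_base64(values):
    """Pack floats as little-endian 32-bit elements and base64-encode them."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return base64.b64encode(raw).decode("ascii")   # assumption: standard alphabet, padded

def base64_to_f32_vector(text):
    """Inverse of the helper above, useful for checking a payload."""
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

encoded = f32_vector_to_base64([1.0, 2.0])
print(encoded)                         # AACAPwAAAEA=
print(base64_to_f32_vector(encoded))   # [1.0, 2.0]
```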

`rand_vec(n, type?)`

Returns a vector of n random numbers between 0 and 1.

Defaults to 32-bit float vectors. If you want to use 64-bit float vectors, pass 'F64' as the second argument.

`l2_normalize(v)`

Takes a vector and returns a vector with the same direction but length 1, normalized using the L2 norm.

?[n] := n = l2_normalize(vec([3.0, 4.0], 'F64'))

[[0.6, 0.8]]

`l2_dist(u, v)`

Takes two vectors and returns the distance between them, using squared L2 norm: d = sum((ui-vi)^2).

`ip_dist(u, v)`

Takes two vectors and returns the distance between them, using inner product: d = 1 - sum(ui*vi).

`cos_dist(u, v)`

Takes two vectors and returns the distance between them, using cosine distance: d = 1 - sum(ui*vi) / (sqrt(sum(ui^2)) * sqrt(sum(vi^2))).
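The three formulas are easy to confuse, in particular because l2_dist is the squared distance and ip_dist can be negative. A plain-Python transcription of the formulas as written above, for illustration only (the engine works on typed vectors, not Python lists):

```python
import math

def l2_dist(u, v):
    """Squared L2 distance: sum((ui - vi)^2). Note there is no square root."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def ip_dist(u, v):
    """Inner-product distance: 1 - sum(ui * vi). Negative for large dot products."""
    return 1 - sum(a * b for a, b in zip(u, v))

def cos_dist(u, v):
    """Cosine distance: 1 - dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1 - dot / (norm_u * norm_v)

print(l2_dist([0, 0], [3, 4]))    # 25, not 5
print(ip_dist([1, 2], [3, 4]))    # -10
print(cos_dist([1, 0], [0, 1]))   # 1.0 for orthogonal vectors
```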

JSON functions

For columns that hold schemaless payloads — tool outputs, API responses, per-row metadata — queried without unpacking them into typed columns first. The -> operator makes path access read naturally:

?[name, lang, tags] := profile = parse_json('{"name": "Maya", "prefs": {"lang": "rust"}}'),
    name = profile->'name',
    lang = get(profile, ['prefs', 'lang']),
    tags = dump_json(profile->'tags' ~ json([]))

["Maya", "rust", "[]"]

`json(x)`

Converts any value to a Json value. This function is idempotent and never fails.

`is_json(x)`

Returns true if the argument is a Json value, false otherwise.

`json_object(k1, v1, ...)`

Converts a list of key-value pairs to a Json object.

`dump_json(x)`

Converts a Json value to its string representation.

`parse_json(x)`

Parses a string to a Json value.

`get(json, idx, default?)`

Returns the element at index idx in the Json json.

idx may be a string (for indexing objects), a number (for indexing arrays), or a list of strings and numbers (for indexing deep structures).

Raises an error if the requested element cannot be found, unless default is specified, in which case default is returned.

Leaf values (numbers, strings, booleans, null) are returned as plain scalars; nested arrays and objects are returned as Json values.

`maybe_get(json, idx)`

Returns the element at index idx in the Json json. Same as get(json, idx, null). The shorthand is json->idx.

`set_json_path(json, path, value)`

Sets the value at the given path in the given Json value. The path is a list of keys: strings (for indexing objects) or numbers (for indexing arrays). The value is converted to Json if it is not already a Json value.

`remove_json_path(json, path)`

Removes the value at the given path in the given Json value. The path is a list of keys: strings (for indexing objects) or numbers (for indexing arrays).

`json_to_scalar(x)`

Converts a Json value to a scalar value if it is a null, boolean, number or string, and returns the argument unchanged otherwise.

`concat(x, y, ...)`

Concatenates (deep-merges) Json values. It is equivalent to the operator form x ++ y ++ ....

The concatenation of two Json arrays is the concatenation of the two arrays. The concatenation of two Json objects is the deep-merge of the two objects, meaning that their key-value pairs are combined, with any pairs that appear in both left and right having their values deep-merged. For all other cases, the right value wins.

?[merged] := merged = dump_json(parse_json('{"a": 1, "b": {"x": 1}}') ++ parse_json('{"b": {"y": 2}}'))

["{\"a\":1,\"b\":{\"x\":1,\"y\":2}}"]
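The merge rule can be sketched outside the engine. The Python function below is a hypothetical model of the three cases described above (arrays concatenate, objects deep-merge, anything else resolves to the right value), using plain dicts and lists in place of Json values; it is an illustration of the rule, not the engine's implementation.

```python
def json_concat(left, right):
    """Model of Json ++ Json: concatenate arrays, deep-merge objects, else right wins."""
    if isinstance(left, list) and isinstance(right, list):
        return left + right
    if isinstance(left, dict) and isinstance(right, dict):
        merged = dict(left)
        for key, value in right.items():
            # Keys present on both sides have their values merged recursively.
            merged[key] = json_concat(left[key], value) if key in left else value
        return merged
    return right

merged = json_concat({"a": 1, "b": {"x": 1}}, {"b": {"y": 2}})
print(merged)   # {'a': 1, 'b': {'x': 1, 'y': 2}}
```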

String functions

Normalization, trimming, and slicing: the plumbing of clean keys and readable output. For substring search across a whole relation, a full-text index scales better than filtering with these.

`length(str)`

Returns the number of Unicode characters in the string.

Can also be applied to a list, a byte array, or a vector.

Caution

length(str) does not return the number of bytes of the string representation. Also, what is returned depends on the normalization of the string. So if such details are important, apply unicode_normalize before length.
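The effect of normalization on a character count is easy to see in any Unicode-aware language. The Python sketch below assumes that "Unicode characters" means code points, which is how Python's len counts; the same accented letter has length 1 in NFC and length 2 in NFD, and neither matches its UTF-8 byte count in general.

```python
import unicodedata

composed = unicodedata.normalize("NFC", "e\u0301")     # the single code point U+00E9
decomposed = unicodedata.normalize("NFD", "\u00e9")    # 'e' followed by combining U+0301

char_count_nfc = len(composed)                   # 1
char_count_nfd = len(decomposed)                 # 2
byte_count_nfc = len(composed.encode("utf-8"))   # 2
```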

`concat(x, ...)`

Concatenates strings. Equivalent to x ++ y in the binary case.

Can also be applied to lists.

`str_includes(x, y)`

Returns true if x contains the substring y, false otherwise.

`lowercase(x)`

Converts to lowercase. Supports Unicode.

`uppercase(x)`

Converts to uppercase. Supports Unicode.

`trim(x)`

Removes whitespace from both ends of the string.

`trim_start(x)`

Removes whitespace from the start of the string.

`trim_end(x)`

Removes whitespace from the end of the string.

`starts_with(x, y)`

Tests if x starts with y.

Note

starts_with(var, str) is preferred over equivalent (e.g. regex) conditions, since the compiler may more easily compile the clause into a range scan.

`ends_with(x, y)`

Tests if x ends with y.

`unicode_normalize(str, norm)`

Converts str to the normalization specified by norm. The valid values of norm are 'nfc', 'nfd', 'nfkc' and 'nfkd'.

`slice_string(str, start, end)`

Returns the substring between character index start (inclusive) and end (exclusive), counted in Unicode characters. Both indices must be non-negative and start <= end — unlike slice for lists, negative indices are not accepted.

?[date] := date = slice_string(format_timestamp(1751587200), 0, 10)

["2025-07-04"]

`chars(str)`

Returns the Unicode characters of the string as a list of substrings.

`from_substrings(list)`

Combines the strings in list into a single string. In a sense, it is the inverse of chars.

Note

For a plain character-range substring, use slice_string. For anything more involved — indexing, filtering, reordering — convert the string to a list with chars, manipulate the list, and recombine with from_substrings.

`t2s(str)`

Converts Traditional Chinese to Simplified Chinese, e.g. t2s('憂鬱的臺灣烏龜') == '忧郁的台湾乌龟'. Non-string arguments are returned unchanged.

List functions

Lists are the general-purpose composite value, and this is the largest toolbox: construction, access, reshaping, and set operations, all usable inside a single expression.

?[xs, top, pairs, u] := xs = int_range(1, 6),
    top = last(xs),
    pairs = chunks(xs, 2),
    u = union([1, 2], [2, 3])

[[1, 2, 3, 4, 5], 5, [[1, 2], [3, 4], [5]], [1, 2, 3]]

`list(x, ...)`

Constructs a list from its arguments, e.g. list(1, 2, 3). Equivalent to the literal form [1, 2, 3].

`int_range(end)`, `int_range(start, end)`, `int_range(start, end, step)`

Returns the list of integers in the half-open range [start, end), defaulting to start = 0 and step = 1: int_range(5) == [0, 1, 2, 3, 4], int_range(1, 6) == [1, 2, 3, 4, 5]. A negative step counts down from start while greater than end.

`is_in(el, list)`

Tests the membership of an element in a list.

`first(l)`

Extracts the first element of the list. Returns null if given an empty list.

`last(l)`

Extracts the last element of the list. Returns null if given an empty list.

`get(l, n, default?)`

Returns the element at index n in the list l. Raises an error if the access is out of bounds, unless default is specified, in which case default is returned. Indices start with 0.

`maybe_get(l, n)`

Returns the element at index n in the list l. Same as get(l, n, null). The shorthand is l->n.

`length(list)`

Returns the length of the list.

Can also be applied to a string, a byte array, or a vector.

`slice(l, start, end)`

Returns the slice of the list between index start (inclusive) and end (exclusive). Negative indices may be used and are interpreted as counting from the end of the list, e.g. slice([1, 2, 3, 4], 1, 3) == [2, 3], slice([1, 2, 3, 4], 1, -1) == [2, 3].

`concat(x, ...)`

Concatenates lists. The binary case is equivalent to x ++ y.

Can also be applied to strings.

`prepend(l, x)`

Prepends x to l.

`append(l, x)`

Appends x to l.

`reverse(l)`

Reverses the list.

`sorted(l)`

Sorts the list and returns the sorted copy.

`chunks(l, n)`

Splits the list l into chunks of n, e.g. chunks([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]].

`chunks_exact(l, n)`

Splits the list l into chunks of n, discarding any trailing elements, e.g. chunks_exact([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4]].

`windows(l, n)`

Splits the list l into overlapping windows of length n, e.g. windows([1, 2, 3, 4, 5], 3) == [[1, 2, 3], [2, 3, 4], [3, 4, 5]].
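The three reshaping functions differ only in step size and in how they treat a short tail. A compact Python model, for illustration (the function names mirror the catalog entries; this is not the engine's code):

```python
def chunks(l, n):
    """Consecutive chunks of n; the last chunk may be shorter."""
    return [l[i:i + n] for i in range(0, len(l), n)]

def chunks_exact(l, n):
    """Consecutive chunks of exactly n; trailing elements are discarded."""
    return [l[i:i + n] for i in range(0, len(l) - n + 1, n)]

def windows(l, n):
    """Overlapping windows of length n, advancing one element at a time."""
    return [l[i:i + n] for i in range(len(l) - n + 1)]

xs = [1, 2, 3, 4, 5]
print(chunks(xs, 2))         # [[1, 2], [3, 4], [5]]
print(chunks_exact(xs, 2))   # [[1, 2], [3, 4]]
print(windows(xs, 3))        # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
```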

`union(x, y, ...)`

Computes the set-theoretic union of all the list arguments.

`intersection(x, y, ...)`

Computes the set-theoretic intersection of all the list arguments.

`difference(x, y, ...)`

Computes the set-theoretic difference of the first argument with respect to the rest.

Binary functions

For byte arrays: bitwise operations, boolean bitsets, and base64 transport.

`length(bytes)`

Returns the length of the byte array.

Can also be applied to a list, a string, or a vector.

`bit_and(x, y)`

Calculates the bitwise and. The two byte arrays must have the same length.

`bit_or(x, y)`

Calculates the bitwise or. The two byte arrays must have the same length.

`bit_not(x)`

Calculates the bitwise not.

`bit_xor(x, y)`

Calculates the bitwise xor. The two byte arrays must have the same length.

`pack_bits([...])`

Packs a list of booleans into a byte array; if the length of the list is not a multiple of 8, it is padded with false.

`unpack_bits(x)`

Unpacks a byte array into a list of booleans.

`encode_base64(b)`

Encodes the byte array b as a Base64 string.

Note

encode_base64 is applied automatically when bytes are output to JSON, since JSON cannot represent bytes natively.

`decode_base64(str)`

Tries to decode str as a Base64-encoded byte array.

Type checking and conversions

Guards and coercions for mixed-type data: checking what a value is before using it, and converting at the boundaries where typed columns meet loosely typed input.

`coalesce(x, ...)`

Returns the first non-null value; coalesce(x, y) is equivalent to x ~ y.

`to_string(x)`

Converts x to a string: the argument is unchanged if it is already a string, a Json string converts to its unquoted content, and any other value converts to its JSON string representation.

`to_float(x)`

Tries to convert x to a float. Conversion from numbers always succeeds. Conversion from strings has the following special cases in addition to the usual string representation:

INF is converted to infinity;
NEG_INF is converted to negative infinity;
NAN is converted to NAN (but don't compare NAN by equality, use is_nan instead);
PI is converted to pi (3.14159...);
E is converted to Euler's number, the base of natural logarithms (2.71828...).

Converts null and false to 0.0, true to 1.0.

`to_int(x)`

Converts to an integer. Floats are truncated toward zero (to_int(-1.7) == -1); strings are parsed; null converts to 0 and booleans to 0 or 1. If x is a validity, extracts its timestamp as an integer (in microseconds since the UNIX epoch).

`to_unity(x)`

Tries to convert x to 0 or 1: null, false, 0, 0.0, "", [], and the empty bytes are converted to 0, and everything else is converted to 1.

`to_bool(x)`

Tries to convert x to a boolean. The following are converted to false, and everything else is converted to true:

null
false
0, 0.0
"" (empty string)
the empty byte array
the nil UUID (all zeros)
[] (the empty list)
any validity that is a retraction

`to_uuid(x)`

Tries to convert x to a UUID. The input must either be a hyphenated UUID string representation or already a UUID for it to succeed.

`uuid_timestamp(x)`

Extracts the embedded timestamp from a UUID, as seconds since the UNIX epoch. UUID versions that carry a timestamp (1, 6 and 7) decode; versions without one (such as the random version 4) return null. If x is not a UUID, an error is raised.

`is_null(x)`

Checks for null.

`is_int(x)`

Checks for integers.

`is_float(x)`

Checks for floats.

`is_finite(x)`

Returns true if x is an integer or a finite float.

`is_infinite(x)`

Returns true if x is infinity or negative infinity.

`is_nan(x)`

Returns true if x is the special float NAN. Returns false when the argument is not of number type.

`is_num(x)`

Checks for numbers.

`is_bytes(x)`

Checks for bytes.

`is_list(x)`

Checks for lists.

`is_vec(x)`

Checks for vectors.

`is_string(x)`

Checks for strings.

`is_uuid(x)`

Checks for UUIDs.

Random functions

Sampling and identifier generation. These are the non-deterministic exceptions noted at the top of the page.

`rand_float()`

Generates a float in the half-open interval [0, 1), sampled uniformly.

`rand_bernoulli(p)`

Generates a boolean with probability p of being true. p must be between 0 and 1, inclusive.

`rand_int(lower, upper)`

Generates an integer within the given bounds; both bounds are inclusive.

`rand_choose(list)`

Randomly chooses an element from list and returns it. If the list is empty, it returns null.

`rand_uuid_v1()`

Generates a random UUID, version 1 (random bits plus timestamp). The resolution of the timestamp part is much coarser on WASM targets than on other targets.

`rand_uuid_v4()`

Generates a random UUID, version 4 (completely random bits).

`rand_vec(n, type?)`

Generates a vector of n random elements. If type is not given, it defaults to F32.

`rand_ulid()`

Returns a fresh ULID as a 26-character Crockford-base32 string: a 48-bit millisecond timestamp followed by 80 random bits. Because the timestamp is the high-order component, ULIDs sort lexicographically in creation order — ideal keys for append-only streams read back by recency.

`ulid_timestamp(ulid)`

Extracts the embedded timestamp from a ULID string, as an integer count of milliseconds since the UNIX epoch. Lowercase input and Crockford's confusable aliases (O→0, I/L→1) are accepted; a malformed or non-canonical ULID raises an error rather than decoding to a wrong timestamp.

?[id, ms] := id = rand_ulid(), ms = ulid_timestamp(id)

["01KX9ED2CAJTBGFC27NFG5P9DR", 1783802268042]

Your values will differ: the id embeds the wall clock at the moment of the call.
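A client that needs the timestamp without a round trip can decode it locally, since the first ten characters carry the 48-bit millisecond count. The Python sketch below is a hypothetical helper following the ULID layout described above, including the lowercase and alias handling; the exact checks the engine applies to reject non-canonical input are not spelled out on this page, so the overflow check here is an assumption.

```python
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
ALIASES = str.maketrans({"O": "0", "I": "1", "L": "1"})

def ulid_timestamp_ms(ulid: str) -> int:
    """Decode the millisecond timestamp from the first 10 characters of a ULID."""
    text = ulid.upper().translate(ALIASES)
    if len(text) != 26 or any(ch not in CROCKFORD for ch in text):
        raise ValueError("malformed ULID")
    if text[0] > "7":
        # Assumption: a leading character above 7 would overflow 128 bits.
        raise ValueError("non-canonical ULID: value overflows 128 bits")
    ms = 0
    for ch in text[:10]:
        ms = ms * 32 + CROCKFORD.index(ch)   # 5 bits per character
    return ms

print(ulid_timestamp_ms("01KX9ED2CAJTBGFC27NFG5P9DR"))   # 1783802268042
```

The input is the id from the example output above, and the decoded value matches the ms column shown there.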

mnestic

rand_ulid() and ulid_timestamp() are mnestic additions (0.8.0). See ULID identifiers for why sortable keys suit agentic-memory workloads.

Regex functions

Lightweight pattern matching and extraction inside expressions: pulling a number out of a sentence, validating a shape, rewriting a string. For ranked search over prose, use a full-text index instead.

?[id, num] := *memory{ id, text },
    regex_matches(text, '[0-9]+'),
    num = to_int(regex_extract_first(text, '[0-9]+'))

`regex_matches(x, reg)`

Tests if x matches the regular expression reg.

`regex_replace(x, reg, y)`

Replaces the first occurrence of the pattern reg in x with y.

`regex_replace_all(x, reg, y)`

Replaces all occurrences of the pattern reg in x with y, e.g. regex_replace_all(lowercase('Postgres Connector'), '[^a-z0-9]+', '-') == 'postgres-connector'.

`regex_extract(x, reg)`

Extracts all occurrences of the pattern reg in x and returns them in a list.

`regex_extract_first(x, reg)`

Extracts the first occurrence of the pattern reg in x and returns it. If none is found, returns null.

Regex syntax

Matching one character:

.             any character except new line
\d            digit (\p{Nd})
\D            not digit
\pN           One-letter name Unicode character class
\p{Greek}     Unicode character class (general category or script)
\PN           Negated one-letter name Unicode character class
\P{Greek}     negated Unicode character class (general category or script)

Character classes:

[xyz]         A character class matching either x, y or z (union).
[^xyz]        A character class matching any character except x, y and z.
[a-z]         A character class matching any character in range a-z.
[[:alpha:]]   ASCII character class ([A-Za-z])
[[:^alpha:]]  Negated ASCII character class ([^A-Za-z])
[x[^xyz]]     Nested/grouping character class (matching any character except y and z)
[a-y&&xyz]    Intersection (matching x or y)
[0-9&&[^4]]   Subtraction using intersection and negation (matching 0-9 except 4)
[0-9--4]      Direct subtraction (matching 0-9 except 4)
[a-g~~b-h]    Symmetric difference (matching `a` and `h` only)
[\[\]]        Escaping in character classes (matching [ or ])

Composites:

xy    concatenation (x followed by y)
x|y   alternation (x or y, prefer x)

Repetitions:

x*        zero or more of x (greedy)
x+        one or more of x (greedy)
x?        zero or one of x (greedy)
x*?       zero or more of x (ungreedy/lazy)
x+?       one or more of x (ungreedy/lazy)
x??       zero or one of x (ungreedy/lazy)
x{n,m}    at least n x and at most m x (greedy)
x{n,}     at least n x (greedy)
x{n}      exactly n x
x{n,m}?   at least n x and at most m x (ungreedy/lazy)
x{n,}?    at least n x (ungreedy/lazy)
x{n}?     exactly n x

Empty matches:

^     the beginning of the text
$     the end of the text
\A    only the beginning of the text
\z    only the end of the text
\b    a Unicode word boundary (\w on one side and \W, \A, or \z on the other)
\B    not a Unicode word boundary

Timestamp functions

There is no dedicated datetime type: time is plain numbers. The functions in this section work in seconds since the UNIX epoch, carried as a float — that is what now() returns and what format_timestamp consumes. Validities — the timestamped assert/retract values behind time travel — count in integer microseconds. The two units are a factor of a million apart, and the engine will not bridge them for you.

?[utc, tokyo, back] := ts = 1751587200,
    utc = format_timestamp(ts),
    tokyo = format_timestamp(ts, 'Asia/Tokyo'),
    back = parse_timestamp(utc)

["2025-07-04T00:00:00+00:00", "2025-07-04T09:00:00+09:00", 1751587200.0]

Caution

now() and parse_timestamp() cannot be used directly in a validity position. They return float seconds, while a validity stores integer microseconds, so a seconds value read as microseconds is a million times too small: a timestamp meant for 2025 lands on a moment in January 1970. Through mnestic 0.12.1 the engine coerced the float silently: a :put stored the row at 1970 and returned success, and an @ now() read returned no rows and no error. Since 0.12.2 every validity position rejects a float: the @ selector and :as_of (parser::float_validity_spec), the validity(...) constructor, and a Validity column on the write path (eval::float_validity).

Bridge the units with to_int, which truncates a float to an integer:

?[micros] := micros = to_int(parse_timestamp('2025-07-04T00:00:00Z') * 1000000)

[1751587200000000]

round(), floor() and ceil() return floats when given a float, so none of them produces the integer a validity requires; only to_int does. A Validity column default is written default [to_int(now() * 1000000), true], or, more simply, default 'ASSERT'.

Since 0.13.0 the idiomatic bridge is the typed dt_to_validity: it does the seconds→microseconds conversion inside the function, where the unit is known, and returns a Validity that @ and :as_of accept directly — @ dt_to_validity(parse_timestamp('2025-07-04T00:00:00Z')).
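The same unit mismatch exists on the client side whenever a host-language timestamp is turned into a validity. The Python sketch below shows the conversion and the failure mode it avoids; seconds_to_validity_micros is a hypothetical helper name, and int() is used because it truncates toward zero in the same way to_int is described as doing.

```python
from datetime import datetime, timezone

MICROS_PER_SECOND = 1_000_000

def seconds_to_validity_micros(seconds: float) -> int:
    """Convert float Unix seconds to the integer microseconds a validity stores."""
    return int(seconds * MICROS_PER_SECOND)   # truncates toward zero

seconds = datetime(2025, 7, 4, tzinfo=timezone.utc).timestamp()   # 1751587200.0
micros = seconds_to_validity_micros(seconds)                      # 1751587200000000

# What the silent coercion used to do: the seconds value read as microseconds.
misread = datetime.fromtimestamp(seconds / MICROS_PER_SECOND, tz=timezone.utc)
print(misread.isoformat())   # a moment on 1970-01-01, about 29 minutes after the epoch
```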

`now()`

Returns the current timestamp as seconds since the UNIX epoch (a float; see the callout above before putting it in a validity). The resolution is much coarser on WASM targets than on other targets.

`format_timestamp(ts, tz?)`

Interprets ts as seconds since the epoch and formats it as a string according to RFC3339. If ts is a validity, its timestamp will be converted to seconds and used.

If a second string argument is provided, it is interpreted as a timezone and used to format the timestamp.

`parse_timestamp(str)`

Parses str into seconds since the epoch according to RFC3339. The result is a float, in seconds — to_int(parse_timestamp(str) * 1000000) is the microsecond integer a validity wants.

`validity(ts_micro, is_assert?)`

Returns a validity object with the given timestamp in microseconds. If is_assert is true, the validity is an assertion, otherwise a retraction. Defaults to true.

?[v, ts] := v = validity(1751587200000000), ts = to_int(v)

[[1751587200000000, true], 1751587200000000]

The timestamp must be an integer. Since mnestic 0.12.2 a float is an error rather than a value coerced a millionfold too small:

?[v] := v = validity(parse_timestamp('2025-07-04T00:00:00Z'))

eval::throw: Evaluation of expression failed
help: 'validity' expects an integer number of MICROSECONDS since the Unix
epoch, got the float 1751587200. now() and parse_timestamp() return float
SECONDS: write validity(to_int(<expr> * 1000000)). (round() returns a float and
will not convert it.)

Datetime functions

mnestic

Specific to mnestic 0.13.0. mnestic markets a bitemporal database; its datetime standard library was three functions with inconsistent units. The dt_* family fills in the missing surface on one loudly-stated convention: timestamps are float Unix seconds (what now() and parse_timestamp return), a float second-count is not a validity (validities count integer microseconds), and the one bridge between the two worlds is dt_to_validity. Timezone-sensitive functions take an optional trailing IANA-name string and default to UTC. The dt_* names are reserved against user registration.

Component extractors

dt_year, dt_month, dt_day, dt_hour, dt_minute, dt_second, dt_dow (ISO day-of-week, Monday = 1), and dt_doy (day-of-year) each take (ts, tz?) and return an Int, interpreting ts as float Unix seconds in the given timezone (default UTC).

?[y, mo, d, dow] := ts = parse_timestamp('2025-07-04T09:30:00Z'),
    y = dt_year(ts), mo = dt_month(ts), d = dt_day(ts), dow = dt_dow(ts)

[2025, 7, 4, 5]
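
To see why the optional tz argument matters, here is a small Python model of the extractors. It is a sketch under assumptions, not mnestic code: components is a hypothetical helper, and a fixed UTC-10 offset stands in for an IANA timezone name so the example has no timezone-database dependency.

```python
from datetime import datetime, timedelta, timezone


def components(ts, tz=timezone.utc):
    """Split float Unix seconds into calendar fields in the given zone."""
    d = datetime.fromtimestamp(ts, tz=tz)
    return {
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "hour": d.hour,
        "minute": d.minute,
        "second": d.second,
        "dow": d.isoweekday(),  # ISO day-of-week: Monday = 1 ... Sunday = 7
    }


ts = 1751621400.0  # 2025-07-04T09:30:00Z as float seconds

utc = components(ts)
print(utc["year"], utc["month"], utc["day"], utc["dow"])  # 2025 7 4 5

# Same instant, ten hours west of UTC: still the previous calendar day.
west = components(ts, timezone(timedelta(hours=-10)))
print(west["day"], west["hour"], west["dow"])  # 3 23 4
```

The instant is the same in both calls; only the calendar fields derived from it change with the zone.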

`dt_trunc(ts, unit, tz?)`

Truncates ts down to the start of the given unit and returns float seconds. unit is one of year, quarter, month, week (ISO, Monday-based), day, hour, minute, second. DST is handled rather than hand-waved: for sub-day units an ambiguous local time resolves within its own fold, and a local time landing in a spring-forward gap resolves to the first valid instant after it. Timestamps at the edges of the supported range raise a loud error instead of panicking.
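
The unit boundaries can be illustrated with a Python model. This is a sketch that assumes UTC only, so it sidesteps the DST handling described above entirely; dt_trunc_utc is a hypothetical name and not part of mnestic.

```python
from datetime import datetime, timedelta, timezone


def dt_trunc_utc(ts, unit):
    """Truncate float Unix seconds to the start of `unit`, in UTC only."""
    d = datetime.fromtimestamp(ts, tz=timezone.utc)
    midnight = d.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "year":
        out = midnight.replace(month=1, day=1)
    elif unit == "quarter":
        # Quarters start in months 1, 4, 7 and 10.
        out = midnight.replace(month=3 * ((d.month - 1) // 3) + 1, day=1)
    elif unit == "month":
        out = midnight.replace(day=1)
    elif unit == "week":
        # weekday() is 0 for Monday, so this steps back to the ISO week start.
        out = midnight - timedelta(days=d.weekday())
    elif unit == "day":
        out = midnight
    elif unit == "hour":
        out = d.replace(minute=0, second=0, microsecond=0)
    elif unit == "minute":
        out = d.replace(second=0, microsecond=0)
    elif unit == "second":
        out = d.replace(microsecond=0)
    else:
        raise ValueError(f"unknown unit: {unit!r}")
    return out.timestamp()


def iso(ts):
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()


ts = 1751621400.0  # 2025-07-04T09:30:00Z, a Friday
print(iso(dt_trunc_utc(ts, "week")))     # 2025-06-30T00:00:00+00:00
print(iso(dt_trunc_utc(ts, "quarter")))  # 2025-07-01T00:00:00+00:00
print(iso(dt_trunc_utc(ts, "hour")))     # 2025-07-04T09:00:00+00:00
```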

`dt_add(ts, n, unit)`

Adds n (an integer, possibly negative) units to ts, returning float seconds. month, quarter and year are calendar-aware and clamp to month end (Jan-31 + 1 month = Feb-28/29); week, day, hour, minute, second are fixed durations. Overflow errors rather than panicking.
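
The split between calendar-aware and fixed-duration units can be modeled in a few lines of Python. This is a sketch of the clamping rule as described above, assuming UTC; dt_add_model is a hypothetical name, and mnestic's own implementation may differ in details such as timezone handling.

```python
import calendar
from datetime import datetime, timezone

FIXED_SECONDS = {"week": 604800, "day": 86400, "hour": 3600, "minute": 60, "second": 1}
CALENDAR_MONTHS = {"month": 1, "quarter": 3, "year": 12}


def dt_add_model(ts, n, unit):
    """Add n units to float Unix seconds; calendar units clamp to month end."""
    if unit in FIXED_SECONDS:
        return ts + n * FIXED_SECONDS[unit]
    if unit not in CALENDAR_MONTHS:
        raise ValueError(f"unknown unit: {unit!r}")
    d = datetime.fromtimestamp(ts, tz=timezone.utc)
    # Count months from year 0 so negative n and year rollover fall out of divmod.
    total = d.year * 12 + (d.month - 1) + n * CALENDAR_MONTHS[unit]
    year, month0 = divmod(total, 12)
    month = month0 + 1
    # Clamp the day to the last day of the target month.
    day = min(d.day, calendar.monthrange(year, month)[1])
    return d.replace(year=year, month=month, day=day).timestamp()


def iso_date(ts):
    return datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()


jan31 = datetime(2024, 1, 31, tzinfo=timezone.utc).timestamp()
print(iso_date(dt_add_model(jan31, 1, "month")))   # 2024-02-29 (leap year)
print(iso_date(dt_add_model(jan31, 13, "month")))  # 2025-02-28
print(iso_date(dt_add_model(jan31, 1, "day")))     # 2024-02-01
```

Clamping is lossy: in this model, adding one month to Jan-31 and then subtracting one month lands on Jan-29, not Jan-31.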

`dt_diff(a, b, unit)`

Returns the signed difference a − b in whole units as an Int, truncating toward zero and antisymmetric (dt_diff(a, b, u) == -dt_diff(b, a, u)). month/quarter/year compute the calendar magnitude consistent with dt_add's forward clamping, so dt_diff('2024-01-30', '2024-03-31', 'month') is -2, not -3.
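
Truncation toward zero is what makes the result antisymmetric; floor division would not be. The Python sketch below models only the fixed-duration units and deliberately leaves out the calendar units, whose exact algorithm is not spelled out here. dt_diff_fixed is a hypothetical name.

```python
import math

FIXED_SECONDS = {"week": 604800, "day": 86400, "hour": 3600, "minute": 60, "second": 1}


def dt_diff_fixed(a, b, unit):
    """Signed difference a - b in whole fixed-duration units, truncated toward zero."""
    return math.trunc((a - b) / FIXED_SECONDS[unit])


a, b = 0.0, 5400.0  # 90 minutes apart

print(dt_diff_fixed(b, a, "hour"))  # 1
print(dt_diff_fixed(a, b, "hour"))  # -1, the exact negation

# Floor division rounds toward negative infinity and breaks the symmetry.
print((a - b) // 3600)  # -2.0
```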

`dt_format(ts, fmt, tz?)`

Formats ts with a strftime format string. The format is pre-validated — an invalid specifier is a loud error where calling the underlying library directly would panic — because dt_format is expected to receive LLM-authored text. format_timestamp stays the RFC3339 formatter.
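
Pre-validation of this kind amounts to scanning the format string before handing it to the formatter. The following Python sketch shows the idea only: the allowlist is a hypothetical subset chosen for the example, not the set of specifiers mnestic accepts, and validate_format and dt_format_model are illustrative names.

```python
from datetime import datetime, timezone

# Hypothetical allowlist for this sketch; the real set of specifiers may differ.
ALLOWED = set("YmdHMSjaAbB%")


def validate_format(fmt):
    """Raise ValueError on a dangling '%' or a specifier outside the allowlist."""
    i = 0
    while i < len(fmt):
        if fmt[i] == "%":
            if i + 1 >= len(fmt):
                raise ValueError("dangling '%' at end of format")
            if fmt[i + 1] not in ALLOWED:
                raise ValueError(f"unsupported specifier '%{fmt[i + 1]}'")
            i += 2  # skip the specifier character as well
        else:
            i += 1


def dt_format_model(ts, fmt):
    """Validate first, then format float Unix seconds in UTC."""
    validate_format(fmt)
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime(fmt)


print(dt_format_model(1751621400.0, "%Y-%m-%d %H:%M"))  # 2025-07-04 09:30

try:
    dt_format_model(1751621400.0, "%Y-%Q")
except ValueError as err:
    print("rejected:", err)
```

Validating up front means a malformed format from generated text fails with a message that names the offending specifier, before any formatting is attempted.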

`dt_to_validity(ts_seconds, is_assert?)`

The typed bridge between the seconds world and the microseconds world: it takes float Unix seconds, does the ×1,000,000 conversion inside the function where the unit is known, and returns a Validity (asserting when is_assert is true, the default). Since 0.13.0, @ and :as_of accept a Validity-typed expression, so this is the idiomatic way to time-travel to a wall-clock instant:

?[name, title] := *emp{name, title @ dt_to_validity(parse_timestamp('2024-06-01T00:00:00Z'))}

Together with 0.12.2's rejection of a raw float in a validity position, this closes the seconds-vs-microseconds trap: the untyped misread errors loudly, and the typed path carries the unit. The seconds-as-Int form (@ 1704067200) remains inherently ambiguous — valid time is an abstract clock, so no magnitude gate can reject it — which is exactly why the typed bridge exists. Both dt_to_validity and validity() reject the reserved i64 extremes ('NOW' / 'END' and the terminal retract sentinel) that every other write path fences.
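
The size of the trap is easy to show with plain arithmetic. In the Python sketch below, dt_to_validity_model is a stand-in that models only the unit conversion (the tuple it returns is an assumed representation, and the reserved-extreme checks are omitted); micros_to_iso reads an integer the way a validity position would, as microseconds.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def dt_to_validity_model(ts_seconds, is_assert=True):
    """Model of the typed bridge: float seconds -> (integer microseconds, flag)."""
    return (int(ts_seconds * 1_000_000), is_assert)


def micros_to_iso(ts_micro):
    """Interpret an integer as microseconds since the epoch."""
    return (EPOCH + timedelta(microseconds=ts_micro)).isoformat()


instant = 1717200000.0  # 2024-06-01T00:00:00Z as float seconds
v = dt_to_validity_model(instant)
print(v)                    # (1717200000000000, True)
print(micros_to_iso(v[0]))  # 2024-06-01T00:00:00+00:00

# The untyped misread: a seconds count (2024-01-01T00:00:00Z) read as microseconds.
print(micros_to_iso(1704067200))  # 1970-01-01T00:28:24.067200+00:00
```

An integer seconds count read as microseconds is still a perfectly legal instant, about 28 minutes after the epoch, which is why no range check can catch it.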

Interval functions

These functions work on half-open spans [start, end) stored as two-element lists of numbers (sessions, shifts, validity windows) and test them for overlap without decomposing into bound-by-bound comparisons.

interval_overlaps is a mnestic addition (0.10.1), together with the interval_coalesce aggregate that merges a group's overlapping and adjacent spans — see Aggregations.

`interval_overlaps(a, b)`

Returns true if the two intervals share at least one point, under half-open [start, end) semantics: touching intervals do not overlap, and an empty interval [x, x) contains no point, so it overlaps nothing. Mixed Int/Float bounds are compared numerically.

?[touching, apart, nested, empty] :=
    touching = interval_overlaps([0, 5], [5, 10]),
    apart    = interval_overlaps([0, 5], [7, 10]),
    nested   = interval_overlaps([0, 10], [2, 3]),
    empty    = interval_overlaps([3, 3], [0, 10])

[false, false, true, false]

Malformed spans — wrong shape, non-numeric or NaN bounds, start > end — raise an error rather than silently comparing as false:

?[x] := x = interval_overlaps([5, 0], [0, 10])

eval::throw: Evaluation of expression failed
help: 'interval_overlaps' got a malformed interval (start > end): [5, 0]
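
The half-open rules, including the empty-interval case and the malformed-span errors, can be captured in a short Python model. This is a sketch of the documented semantics, not mnestic's implementation; the helper names and error messages are illustrative.

```python
import math


def _check(span):
    """Validate a [start, end) span and return its bounds."""
    if not isinstance(span, (list, tuple)) or len(span) != 2:
        raise ValueError(f"malformed interval (wrong shape): {span!r}")
    for bound in span:
        if isinstance(bound, bool) or not isinstance(bound, (int, float)):
            raise ValueError(f"malformed interval (non-numeric bound): {span!r}")
        if math.isnan(bound):
            raise ValueError(f"malformed interval (NaN bound): {span!r}")
    start, end = span
    if start > end:
        raise ValueError(f"malformed interval (start > end): {span!r}")
    return start, end


def interval_overlaps_model(a, b):
    """True if two half-open intervals share at least one point."""
    a0, a1 = _check(a)
    b0, b1 = _check(b)
    # Strict comparisons exclude touching intervals; the last two terms
    # exclude empty intervals, which contain no point at all.
    return a0 < b1 and b0 < a1 and a0 < a1 and b0 < b1


print(interval_overlaps_model([0, 5], [5, 10]))   # False (touching)
print(interval_overlaps_model([0, 5], [7, 10]))   # False (apart)
print(interval_overlaps_model([0, 10], [2, 3]))   # True  (nested)
print(interval_overlaps_model([3, 3], [0, 10]))   # False (empty)
```

Without the explicit emptiness terms, [3, 3) against [0, 10) would pass both strict comparisons and be reported as overlapping, so the empty case needs its own check.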

Adapted from the CozoDB documentation by Ziyang Hu and the Cozo Project Authors, used under CC‑BY‑SA‑4.0. Adaptations for mnestic are released under the same license. mnestic is an independent fork and is not affiliated with or endorsed by the original authors.