Queries

CozoScript, a Datalog dialect, is the query language of the engine. A query consists of one or more named rules, and each named rule represents a relation: a collection of data divided into rows and columns. The rule named ? is the entry point of the query, and the relation it represents is the query's result. Each named rule has a rule head, which corresponds to the columns of the relation, and a rule body, which specifies the content of the relation or how that content should be computed.

The examples on this page run against a small agent-memory graph used throughout these docs: episodic memories, the entities they involve, and weighted association edges between memories.

memory   { id => kind, text, importance, at, v }   # notes, decisions, insights
entity   { id => name, kind }                      # people, tools, projects
mentions { memory, entity }                        # which memories involve whom
recalls  { from, to => strength }                  # association edges

Rules can apply other rules, including themselves, which is what lets a few lines of Datalog replace a loop of round-trips in application code. Starting from the decision m2 ("Chose RocksDB over sled for the write path"), this query follows recalls edges to every memory downstream of it:

downstream[id] := *recalls{ from: 'm2', to: id }
downstream[id] := downstream[via], *recalls{ from: via, to: id }
 
?[id, text] := downstream[id], *memory{ id, text }

['m3', 'Nightly compaction stalls search-service around 03:00']
['m4', 'Compaction stalls correlate with oversized SST files']
['m5', 'Cap SST file size at 128 MB']

The first rule collects the direct successors of m2; the second, recursive rule extends the set by one hop until nothing new is reached; the entry rule joins the ids back to their text. Every construct in this query is covered in detail below.

Relations, stored or otherwise, abide by set semantics: even if a rule computes a row multiple times, the resulting relation only contains a single copy. In the query above, m5 is reached both directly and through the chain m3 → m4 → m5, yet appears once.
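The recursive evaluation and the set semantics described above can be pictured as a loop that runs until a round adds nothing new. The following is a minimal Python sketch of that idea, not the engine's implementation. The `RECALLS` set holds only the four edges implied by the results on this page (the stored relation contains more), and `downstream` here is an illustrative helper.

```python
# Edges implied by the results on this page; the stored relation holds more.
RECALLS = {("m2", "m3"), ("m3", "m4"), ("m4", "m5"), ("m2", "m5")}


def downstream(start, edges):
    """Return every node reachable from `start` by following edges."""
    # First rule: direct successors of the start node.
    reached = {to for frm, to in edges if frm == start}
    while True:
        # Second rule: extend every reached node by one hop.
        step = {to for frm, to in edges if frm in reached}
        if step <= reached:  # nothing new was derived: fixpoint
            return reached
        reached |= step  # set union keeps a single copy of each id


print(sorted(downstream("m2", RECALLS)))  # ['m3', 'm4', 'm5']
```

Because `reached` is a set, m5 is stored once even though both the direct edge and the chain through m3 and m4 derive it.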

There are two types of named rules in CozoScript:

Inline rules, distinguished by using := to connect the head and the body. The logic used to compute the resulting relation is defined inline.
Fixed rules, distinguished by using <~ to connect the head and the body. The logic is fixed according to which algorithm or utility is requested.

Constant rules, which use <- to connect the head and the body, are syntactic sugar for the Constant fixed rule. For example:

seeds[id] <- [['m3'], ['m4']]

is identical to:

seeds[id] <~ Constant(data: [['m3'], ['m4']])

Inline rules

An inline rule whose body joins three atoms — which memories record decisions, and which entities they concern:

decisions_about[name, text] := *mentions{ memory: id, entity: e },
                               *entity{ id: e, name },
                               *memory{ id, text, kind: 'decision' }
 
?[name, text] := decisions_about[name, text]

['RocksDB', 'Cap SST file size at 128 MB']
['RocksDB', 'Chose RocksDB over sled for the write path']

The rule body of an inline rule consists of multiple atoms joined by commas, and is interpreted as representing the conjunction of these atoms.

Atoms

Atoms come in various flavours. The first is the rule application:

mentioned[e, m] := *mentions{ entity: e, memory: m }
 
?[m] := mentioned['e_pg', m]

['m6']
['m7']
['m8']

In the entry rule, mentioned['e_pg', m] applies the rule named mentioned, which must exist in the same query and have the correct arity (2 here). Each row in the named rule is then unified with the bindings given as parameters in the square bracket: here the first column is unified with a constant string, and unification succeeds only when the string completely matches what is given; the second column is unified with the variable m, and as the variable is fresh at this point (because this is its first appearance), the unification will always succeed. For subsequent atoms, the variable becomes bound: it takes on the value of whatever it was unified with. When a bound variable is unified again, the unification only succeeds when the unified value is the same as the current value. Thus, repeated use of the same variable in named rules corresponds to inner joins in relational algebra: that is what id does in the decisions_about rule above.
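The unification steps just described (constant, fresh variable, bound variable) can be modelled in a few lines of Python. This is a sketch for intuition only: `unify`, the `("const", ...)` / `("var", ...)` pattern encoding, and the `MENTIONED` sample are illustrative assumptions, and the sample rows are a subset inferred from the results on this page.

```python
def unify(row, pattern, bindings):
    """Unify one row with a pattern; return the extended bindings or None.

    Each pattern item is either ("const", value) or ("var", name).
    """
    env = dict(bindings)
    for value, (tag, item) in zip(row, pattern):
        if tag == "const":
            if value != item:        # a constant must match exactly
                return None
        elif item in env:
            if env[item] != value:   # bound variable: values must agree
                return None
        else:
            env[item] = value        # fresh variable: succeeds and becomes bound
    return env


# Assumed subset of the rows of `mentioned`, as (entity, memory) pairs.
MENTIONED = [("e_pg", "m6"), ("e_pg", "m7"), ("e_pg", "m8"), ("e_rocksdb", "m2")]

pattern = [("const", "e_pg"), ("var", "m")]
matches = [env for row in MENTIONED if (env := unify(row, pattern, {})) is not None]
print([env["m"] for env in matches])  # ['m6', 'm7', 'm8']
```

Passing the bindings returned by one atom into the next call is what makes a repeated variable behave like an inner join: the second atom only accepts rows that agree with the value already bound.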

Atoms representing applications of stored relations are written with an asterisk before the name:

?[from, to] := *recalls[from, to, _]

Written this way using square brackets, as many bindings as the arity of the stored relation must be given. If some columns do not need to be bound, use the special underscore variable _ (here, the strength column): it does not take part in any unifications.

You can also bind columns by name:

?[id, importance] := *memory{ id, importance }, importance > 0.8

['m2', 0.9]
['m5', 0.85]

In this form, any number of columns may be omitted, and columns may come in any order. If the name you want to give the binding is the same as the name of the column, *memory{ id } is shorthand for *memory{ id: id }; to rename, write *memory{ id: memory_id }.

Expressions are also atoms, such as importance > 0.8 above. Here importance must be bound somewhere else in the rule. Expression atoms must evaluate to booleans, and act as filters: only rows where the expression evaluates to true are kept.

Unification atoms unify explicitly. Whatever appears on the left-hand side must be a single variable, and it is unified with the result of the right-hand side:

?[line] := *memory{ id: 'm5', kind, text }, line = concat(kind, ': ', text)

['decision: Cap SST file size at 128 MB']

Note

This is different from the equality operator ==, where the left-hand side is a completely bound expression. When the left-hand side is a single bound variable, the equality and the unification operators are equivalent.

Unification atoms can also unify a variable with each value in a list, using in:

?[id, text] := id in ['m2', 'm5'], *memory{ id, text }

['m2', 'Chose RocksDB over sled for the write path']
['m5', 'Cap SST file size at 128 MB']

If the right-hand side does not evaluate to a list, an error is raised.

Head

As explained above, atoms correspond to relations, projections or filters in relational algebra. Linked by commas, they therefore represent a joined relation whose columns are either constants or variables. The head of the rule, which in the simplest case is a list of variables, then defines which columns to keep in the output relation and in what order.

Each variable in the head must be bound in the body (the safety rule). Not all variables appearing in the body need to appear in the head.

Multiple definitions and disjunction

For inline rules only, multiple rule definitions may share the same name, with the requirement that the arity of the head in each definition must match. The returned relation is then formed by the disjunction of the multiple definitions (a union of rows). Here flagged collects memories that record decisions, plus memories of high importance:

flagged[id] := *memory{ id, kind: 'decision' }
flagged[id] := *memory{ id, importance }, importance >= 0.8
 
?[id] := flagged[id]

['m2']
['m4']
['m5']

m2 and m5 satisfy both definitions but, by set semantics, appear once each.
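In set terms, the two definitions of flagged are simply unioned. A small Python illustration, using the ids that each definition produces according to the results above (the variable names are illustrative):

```python
# Rows produced by each definition of `flagged`, taken from the results above.
decisions = {"m2", "m5"}        # kind: 'decision'
important = {"m2", "m4", "m5"}  # importance >= 0.8

flagged = decisions | important  # disjunction is a set union
print(sorted(flagged))                # ['m2', 'm4', 'm5']
print(sorted(decisions & important))  # ['m2', 'm5']: matched twice, kept once
```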

You may also use the explicit disjunction operator or in a single rule definition:

?[id] := *memory{ id, kind: 'decision' } or *memory{ id, kind: 'insight' }

There is also an and operator, semantically identical to the comma , but with higher operator precedence than or (the comma has the lowest precedence).

Negation

Atoms in inline rules may be negated by putting not in front of them. Notes that do not concern Postgres:

?[id, text] := *memory{ id, text, kind: 'note' },
               not *mentions{ memory: id, entity: 'e_pg' }

['m1', 'Maya prefers pull requests under 400 lines']
['m3', 'Nightly compaction stalls search-service around 03:00']

When negating rule applications and stored relations, at least one binding must be bound somewhere else in the rule in a non-negated context (another safety rule); here id is bound by the *memory atom. The unbound bindings in negated rules remain unbound: negation cannot introduce new bindings to be used in the head.

Negated expressions act as negative filters, which is semantically equivalent to putting ! in front of the expression. Explicit unification cannot be negated unless the left-hand side is bound, in which case it is treated as an expression atom and then negated.

Recursion

The body of an inline rule may contain rule applications of itself, and multiple inline rules may apply each other recursively — the downstream rule at the top of this page is the canonical example. The only exception is the entry rule ?, which cannot be referred to by other rules, including itself.

Recursion cannot occur in negated positions (safety rule): r[a] := not r[a] is not allowed.

Caution

As CozoScript allows explicit unification, queries that produce infinite relations may be accepted by the compiler. One of the simplest examples is:

r[a] := a = 0
r[a] := r[b], a = b + 1
?[a] := r[a]
 
:timeout 1

It is not possible, even in principle, to rule out all infinite queries without also rejecting valid ones. Protect against them with a :timeout (the :timeout 1 above aborts the query after one second with an eval::timeout error), or terminate a runaway query with ::kill (see System ops).

mnestic

Since mnestic 0.10.5, :timeout and ::kill genuinely stop such queries: the cancellation check runs throughout evaluation, so even a long enumeration that emits no rows aborts promptly. In earlier engines both could fail to take effect. See Interruptibility & query budgets.
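To see why the r query never finishes on its own and how a wall-clock budget stops it, here is a minimal Python model. It is a sketch of the idea only: `enumerate_r` and the use of Python's `TimeoutError` are illustrative assumptions and say nothing about how the engine implements its cancellation check.

```python
import time


def enumerate_r(budget_seconds):
    """Model r[a]: start at 0 and keep deriving a + 1 until the budget expires."""
    deadline = time.monotonic() + budget_seconds
    derived = {0}    # r[a] := a = 0
    frontier = {0}
    while frontier:  # never empties: every round derives one new value
        # The deadline is checked on every round of evaluation.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"gave up after deriving {len(derived)} rows")
        frontier = {b + 1 for b in frontier} - derived  # r[a] := r[b], a = b + 1
        derived |= frontier


try:
    enumerate_r(0.05)
except TimeoutError as exc:
    print("aborted:", exc)
```

Without the deadline check the loop would run until memory is exhausted, because each round always finds a value that has not been derived yet.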

Aggregation

In CozoScript, aggregations are specified for inline rules by applying aggregation operators to variables in the rule head:

?[entity, count(memory)] := *mentions{ memory, entity }

['e_maya', 1]
['e_pg', 3]
['e_rocksdb', 3]
['e_sam', 1]
['e_search', 1]

Any variables in the head without aggregation operators are treated as grouping variables (here entity), and aggregation is applied using them as keys. If you do not specify any grouping variables, the resulting relation contains exactly one row.

Aggregation operators are applied to the rows computed by the body of the rule using bag semantics. The reason for this complication is that with set semantics, ?[count(memory)] := *mentions{ memory } would return 1 if there are any matching rows and 0 otherwise. Instead, it counts every row the body produces:

?[count(memory)] := *mentions{ memory }

[9]

The mentions relation has nine rows (m6 mentions two entities), so the count is 9 even though only eight distinct values pass through memory. Set semantics still applies at rule boundaries: route the same binding through an intermediate rule and the duplicates collapse before the count sees them:

counted[memory] := *mentions{ memory }
 
?[count(memory)] := counted[memory]

[8]

Use count_unique instead of an intermediate rule when distinct counting is what you want (see Aggregations).
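The difference between the two counts is the difference between counting a list and counting a set. In this Python illustration the exact ids in `memory_column` are an assumption consistent with the figures above (nine rows, eight distinct values, m6 appearing twice):

```python
# The `memory` column of mentions: nine rows, with m6 appearing twice.
# The exact ids are assumed from the examples on this page.
memory_column = ["m1", "m2", "m3", "m4", "m5", "m6", "m6", "m7", "m8"]

bag_count = len(memory_column)       # every row the body produces
set_count = len(set(memory_column))  # duplicates collapsed first

print(bag_count, set_count)  # 9 8
```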

If a rule has several definitions, they must have identical aggregations applied in the same positions.

Aggregations are allowed for self-recursion for a limited subset of operators, the so-called semi-lattice aggregations (see Aggregations). This computes the minimum number of hops from m2 to every memory it can reach:

hops[to, min(n)] := *recalls{ from: 'm2', to }, n = 1
hops[to, min(n)] := hops[via, prev], *recalls{ from: via, to }, n = prev + 1
 
?[to, n] := hops[to, n]

['m3', 1]
['m4', 2]
['m5', 1]

Here the self-recursion of hops contains the min aggregation: each round of recursion folds new candidate distances into the smallest seen so far, and the computation stops when nothing improves. m5 comes out at distance 1 because of the direct shortcut edge m2 → m5, even though the chain through m3 and m4 also reaches it.
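The way a min aggregation folds into the recursion can be sketched in Python as a loop that keeps the best distance per node and stops when no distance improves. This is an illustrative model rather than the engine's evaluation strategy; `min_hops` is a hypothetical helper and `RECALLS` contains only the four edges implied by the results on this page.

```python
# Edges implied by the results on this page; the stored relation holds more.
RECALLS = {("m2", "m3"), ("m3", "m4"), ("m4", "m5"), ("m2", "m5")}


def min_hops(start, edges):
    """Fewest hops from `start` to each reachable node, by fixpoint iteration."""
    # First rule: direct successors are one hop away.
    hops = {to: 1 for frm, to in edges if frm == start}
    changed = True
    while changed:
        changed = False
        for frm, to in edges:
            if frm in hops:
                candidate = hops[frm] + 1  # second rule: n = prev + 1
                # min folds the candidate into the best value seen so far.
                if candidate < hops.get(to, float("inf")):
                    hops[to] = candidate
                    changed = True
    return hops


print(sorted(min_hops("m2", RECALLS).items()))
# [('m3', 1), ('m4', 2), ('m5', 1)]
```

The candidate distance 3 for m5 (via m3 and m4) is derived but discarded, because 1 is already the smaller value. That is what lets the computation terminate even though it is recursive.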

Caution

For a rule head to be considered semi-lattice-aggregate, the aggregations must come at the end of the rule head. Written as hops[min(n), to], min is considered an ordinary aggregation, and the engine does not reject the recursion. The rule is evaluated once, with its own relation still empty, so the recursive definition silently contributes nothing:

hops[min(n), to] := *recalls{ from: 'm2', to }, n = 1
hops[min(n), to] := hops[prev, via], *recalls{ from: via, to }, n = prev + 1
 
?[n, to] := hops[n, to]

[1, 'm3']
[1, 'm5']

Only the directly-reachable memories survive; m4 is missing. Keep semi-lattice aggregations at the tail of the head. Self-recursion through a genuinely ordinary aggregation such as count is caught and rejected with a "query is unstratifiable" error.

Fixed rules

The body of a fixed rule starts with the name of the utility or algorithm being applied, then takes a specified number of named or stored relations as its input relations, followed by options that you provide:

ranks[id, rank] <~ PageRank(*recalls[], theta: 0.85)
 
?[id, r] := ranks[id, rank], r = round(rank * 1000) / 1000
 
:sort -r
:limit 3

['m5', 0.115]
['m8', 0.048]
['m4', 0.047]

m5, the memory both association chains converge on, collects the most rank, agreeing with what the recursive queries above kept finding. The fixed rule's output is a relation like any other, so the entry rule here rounds the scores for display.

Here the stored relation *recalls is the single input relation expected. Input relations may be stored relations or relations resulting from rules. Each utility/algorithm expects specific shapes for its input relations; consult the Utilities & algorithms reference for each one's API.

In fixed rules, bindings for input relations are usually omitted. When they are provided, they are interpreted in algorithm-specific ways (for example, by the DFS algorithm).

In the example above, theta is an option of the algorithm, which is required to be an expression evaluating to a constant. Each utility/algorithm expects specific types for its options; some options have default values and may be omitted.

Each fixed rule has a determinate output arity. The bindings in the rule head can be omitted, but if provided, you must abide by the arity — PageRank outputs two columns, so ranks[id, rank] is valid and ranks[id] is not.

mnestic

mnestic ships fixed rules of its own: ReciprocalRankFusion and MaximalMarginalRelevance for hybrid retrieval (0.8.0), and BudgetedTraversal for cheapest-first expansion under a node budget (0.12.0). Since 0.11.0, graph algorithms also accept a graph: option naming a cached projection instead of rescanning an edge relation. See Utilities & algorithms, Hybrid retrieval, and Graph projections.

Query options

Each query can have options associated with it:

?[id, text, importance] := *memory{ id, text, importance }
 
:sort -importance
:limit 3

['m2', 'Chose RocksDB over sled for the write path', 0.9]
['m5', 'Cap SST file size at 128 MB', 0.85]
['m4', 'Compaction stalls correlate with oversized SST files', 0.8]

All query options start with a single colon :. Query options can appear before or after rules, or even sandwiched between rules. Several query options deal with transactions and writing to stored relations (:create, :put, :rm, :ensure, :returning, …); those are discussed in Stored relations & transactions. The rest are explained below.

`:limit <N>`

Limit the output relation to at most <N> rows. If possible, execution will stop as soon as this number of output rows is collected (early stopping).

`:offset <N>`

Skip the first <N> rows of the returned relation:

?[id, at] := *memory{ id, at }
 
:sort -at
:offset 2
:limit 2

['m6', 1751760000.0]
['m5', 1751673600.0]

The third and fourth most recent memories: sorting happens first, then the offset and limit are applied.
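The order of operations (sort, then offset, then limit) is the same as sorting a list and taking a slice. A Python sketch under stated assumptions: the timestamps for m5 and m6 are the ones shown above, while those for m7 and m8 are placeholders chosen only to be more recent, and `page` is an illustrative helper.

```python
# (id, at) pairs. m5 and m6 carry the timestamps shown above; the values
# for m7 and m8 are placeholders chosen only to be more recent.
rows = [
    ("m5", 1751673600.0),
    ("m6", 1751760000.0),
    ("m7", 1751846400.0),  # placeholder
    ("m8", 1751932800.0),  # placeholder
]


def page(rows, offset, limit):
    """Sort by `at` descending, then skip `offset` rows, then keep `limit`."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)  # :sort -at
    return ordered[offset:offset + limit]                         # :offset, :limit


print(page(rows, offset=2, limit=2))
# [('m6', 1751760000.0), ('m5', 1751673600.0)]
```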

`:timeout <N>`

Abort the query if it does not complete within <N> seconds. The value may be fractional (:timeout 0.5 is 500 ms) and may be given as any expression that evaluates to a constant, so randomized timeouts are possible. :timeout 0 removes the block's own deadline.

There is no built-in default: a query with no timeout set anywhere runs until it completes or is stopped with ::kill.

mnestic

Since mnestic 0.10.5, the timeout is a real wall-clock budget, and a deadline can come from three places: the in-script :timeout, a per-call timeout (run_script_with_options in Rust, the timeout= keyword in Python, the timeout field on the HTTP query payload), and a Db-wide default (set_default_query_timeout, or --default-query-timeout on the server binary). The effective deadline is the minimum of whichever are set, so a :timeout can only tighten the budget, never extend it. Expiry raises a distinct eval::timeout error (::kill raises eval::killed), and a timed-out or killed mutable query rolls back cleanly with no partial writes. On wasm there is no monotonic clock, so :timeout is inert there. See Interruptibility & query budgets.
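The "minimum of whichever are set" rule can be written out as a small Python function. This is only a model of the rule as described here, not engine code: `effective_timeout` and its parameter names are hypothetical, and the sketch applies the "0 removes the deadline" behaviour to the in-script value only, since that is the only case this page describes.

```python
def effective_timeout(script=None, per_call=None, db_default=None):
    """Return the tightest budget in seconds, or None when nothing is set.

    A script value of 0 is treated as "no deadline from this block".
    """
    if script == 0:
        script = None
    budgets = [b for b in (script, per_call, db_default) if b is not None]
    return min(budgets) if budgets else None


print(effective_timeout(script=5, per_call=2))     # 2
print(effective_timeout(script=0, db_default=30))  # 30
print(effective_timeout())                         # None
```

In the first call the in-script value of 5 seconds does not extend the 2 second per-call budget, which is the "can only tighten" property stated above.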

`:sleep <N>`

If specified, the query will wait for <N> seconds after completion, before committing or proceeding to the next query. Useful for deliberately interleaving concurrent queries to test complex logic. The value must be positive. Not supported on wasm.

`:sort <SORT_ARG> (, <SORT_ARG>)*`

Sort the output relation. If :limit or :offset are specified, they are applied after :sort. Specify each <SORT_ARG> as it appears in the rule head of the entry, separated by commas. You can optionally specify the sort direction by prefixing with + or - (minus denotes descending order):

?[entity, count(memory)] := *mentions{ memory, entity }
 
:sort -count(memory), entity

['e_pg', 3]
['e_rocksdb', 3]
['e_maya', 1]
['e_sam', 1]
['e_search', 1]

This sorts by mention count in descending order first, then breaks ties by entity id in ascending order.
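For comparison, the same two-key ordering applied to the result rows in Python, where negating the count gives the descending direction:

```python
counts = [("e_maya", 1), ("e_pg", 3), ("e_rocksdb", 3), ("e_sam", 1), ("e_search", 1)]

# :sort -count(memory), entity -> count descending, then entity ascending
ordered = sorted(counts, key=lambda row: (-row[1], row[0]))
print(ordered)
# [('e_pg', 3), ('e_rocksdb', 3), ('e_maya', 1), ('e_sam', 1), ('e_search', 1)]
```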

Caution

Aggregations must be done in inline rules, not in output sorting. In the example above, the entry rule head must contain count(memory); memory alone is not acceptable as a sort argument.

`:order <SORT_ARG> (, <SORT_ARG>)*`

Alias for :sort.

`:assert none`

The query returns nothing if the output relation is empty; otherwise execution aborts with an error. Useful for transactions and triggers:

?[id] := *memory{ id, importance }, importance > 0.95
 
:assert none

`:assert some`

Execution aborts with an error if the output relation is empty; otherwise the query proceeds and returns its rows as usual. Useful for transactions and triggers. Consider adding :limit 1 to ensure early termination if you do not need to check all return tuples:

?[id] := *memory{ id, kind: 'decision' }
 
:limit 1
:assert some

['m2']
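The contract of the two assertions can be summarised in Python. This is a behavioural sketch only: `QueryAssertionError`, `assert_none` and `assert_some` are hypothetical names, and the engine's actual error type and message are not modelled.

```python
class QueryAssertionError(Exception):
    """Hypothetical stand-in for the engine's assertion failure."""


def assert_none(rows):
    """Model of :assert none: fail unless the output relation is empty."""
    if rows:
        raise QueryAssertionError(f"expected no rows, got {len(rows)}")
    return []


def assert_some(rows):
    """Model of :assert some: fail on an empty relation, else pass rows through."""
    if not rows:
        raise QueryAssertionError("expected at least one row")
    return rows


print(assert_some([["m2"]]))  # [['m2']]
print(assert_none([]))        # []
```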

`:reorder <MODE>`

Controls join reordering for this query. :reorder written uses the atom order exactly as you wrote it; :reorder greedy explicitly requests the default. Any other value is a parse error.

?[id, text] := *memory{ id, text }, *mentions{ memory: id, entity: 'e_pg' }
 
:reorder written

mnestic

The greedy join reorder landed in mnestic 0.10.5 and is on by default: the positive relation atoms of each conjunction are reordered by a deterministic min-new-variables heuristic, which removes the blow-up a naively ordered multi-join falls into. Result sets are unchanged: reordering a conjunction never changes what it means, only how fast it runs. One case opts out automatically: a bare :limit without :sort, so the returned subset stays the one the written order would produce. See Greedy join reorder.

`:as_of <TIMESTAMP>`

Pin the query to a past belief: every transaction-time-stamped relation atom in the block that lacks an explicit @ (tt: …) selector is read as it stood at <TIMESTAMP>. Explicit per-atom selectors win over :as_of. The option is scoped per query block; using it in a block that references no transaction-time-stamped relation is an error (a typo guard). Plain relations and valid-time-only relations are unaffected, so a query mixing them with tt-stamped relations is only partially reproducible.

The timestamp may be a number of microseconds since the epoch, an RFC 3339 string, a bare date such as "2026-06-24" (midnight UTC), or 'NOW'/'END', which both mean the current belief — transaction time is stamped by the engine at commit, so it can never be in the future.

:create belief { claim: String, tt: TxTime => confidence: Float }

?[claim, confidence] <- [['compaction stalls trace to oversized SST files', 0.8]]
 
:put belief { claim => confidence }

The engine stamps tt at commit time; a plain read returns the current belief:

?[claim, confidence] := *belief{ claim, confidence }

['compaction stalls trace to oversized SST files', 0.8]

Pinned to a moment before the row was committed, the same query returns no rows:

?[claim, confidence] := *belief{ claim, confidence }
 
:as_of "2026-06-24T09:00:00Z"

mnestic

Transaction time (the TxTime column type, engine-assigned commit timestamps, :as_of, and the ::history family of ops) landed in mnestic 0.10.0, alongside the valid-time axis Cozo already had. See Bitemporality for why both axes exist, Time travel for the valid-time model, and System ops for ::history, ::history_gc, and ::evict.

Adapted from the CozoDB documentation by Ziyang Hu and the Cozo Project Authors, used under CC‑BY‑SA‑4.0. Adaptations for mnestic are released under the same license. mnestic is an independent fork and is not affiliated with or endorsed by the original authors.