Release history

This page summarizes the mnestic fork releases. The authoritative, fully-detailed log lives in CHANGELOG-FORK.md.

0.13.1

The factorized count() rewrite ships on, and a dependency upgrade's silent casualty is repaired. A patch release: no new surface, one planner default, one regression closed.

The factorized count() rewrite is ON by default. It has been opt-in since 0.10.5 — not because it failed, but because the planner-regression suite that would catch it if it ever did was still being built. That suite's nightly tier has run green against LDBC's own count oracle every night since it first went green, so the default flips. An eligible single-clause count()-over-a-positive-join is counted without materializing the join; answers are identical by construction (the pass fires only on a provably exact decomposition and leaves every query it declines byte-identical), and set_query_factorization(false) restores the old behaviour. Measured on LSQB sf0.1 (SQLite, release), both paths returning the published count: q1 72.4 s → 1.05 s (~69×), q6 42.1 s → 0.31 s (~134×).

mnestic

A gate that only half-covered what it gated — found by another gate. The nightly tier's rewrite-ON arm ran the one query the rewrite had been benchmarked on, while the rewrite in fact fires on two; the main loop ran the default, and the default was OFF. So q1 had never been executed in the configuration this release makes standard. We found it because the zero-data plan-shape gate went red on two queries when we expected one. The tier now covers every firing query, pins the firing set, and asserts the default path actually fired — a silent decline would otherwise hide behind a green-but-slow run. The rule going forward: a default flip widens its coverage first.

0.13.0's expected-token parse hints work again on current dependencies. pest 2.8.0 made parse-attempt tracking opt-in and off by default. Our requirement was a caret 2.7.9, so anyone resolving fresh built 0.13.0 against 2.8.x — where the help: token list and the corrected caret position silently reverted to pre-0.13.0 behaviour. Our own CI never saw it: the committed lockfile pinned 2.7.x. The floor is now pest = "2.8", and the tracking is enabled only around a re-parse of a script that already failed — it is a pest-wide global with a documented performance cost, and an embedded engine should not leave it switched on inside its host's process. A weekly fresh-resolve CI lane now covers the class of regression a committed lockfile hides.
Source compatibility across the uuid 1.x range, so the crate builds against whichever uuid a consumer's lockfile resolves.

0.13.0

The correctness-union release, and the feature tranche it earned. 0.13.0 pairs a nine-bug hardening pass over the storage and index paths with the primitives a bitemporal, agentic-memory workload actually needs: budgeted hybrid retrieval in one call, a real datetime standard library with a typed validity bridge, and the restored != count rewrite. It ships with mnestic-rocks 0.1.10, which carries the fix below. Engine (cozo-core) plus the RocksDB bridge — CozoScript semantics are unchanged unless noted.

mnestic

The RocksDB table options you configure now actually reach RocksDB. For as long as the options-file path has existed, open_db loaded your BlockBasedTableOptions — block cache, block size, cache_index_and_filter_blocks — and then default-constructed a fresh one a few lines later, threw yours away, and reset the table factory with the default. Silently: no error, no warning. An embedded engine ran with a read cache two orders of magnitude smaller than its host asked for (RocksDB's 8 MB default), block_size at 4 KB, index/filter blocks uncached. block_cache_size was ignored outright, and dropped entirely unless an options file was also supplied. Fixed in mnestic-rocks 0.1.10 by seeding the table options from the factory already in effect. Every read-path benchmark taken before this — ours included — measured a slower engine than mnestic actually is. It changes what RocksDB writes, not only what it caches (new SSTs pick up the configured block_size), but it is forward-compatible and needs no migration. Inherited from upstream Cozo.

HybridSearch gains a budgeted-expansion mode and optional legs. The 0.12.0 BudgetedTraversal primitive is now reachable from the one-call hybrid_search surface (the dict the PyPI wheel already exposes). Setting a graph leg's max_nodes switches it from the recursive min-hop rule to cheapest-first weighted expansion under a global distinct-node budget — with max_cost, an exact layered-label depth bound, an optional liveness gate, and graph: naming a pre-created cached projection (the production path). Seeds default to the union of the vector/FTS legs' own top-k. And vector_index/fts_index are now optional: configure any non-empty subset of the vector, FTS, graph and extra-list legs and only those are generated and fused — a payload missing its leg is a loud error, never a silently dropped signal. HybridSearch/GraphLeg are now #[non_exhaustive] (construct with Default + field mutation); the wheel's dict surface gains optional keys only, so existing dicts parse unchanged.
A datetime standard library (dt_*) and a typed validity bridge. We market a bitemporal database; its datetime surface was three functions with inconsistent units. This adds component extractors (dt_year … dt_dow/dt_doy), dt_trunc, calendar-aware dt_add/dt_diff, strftime dt_format, and — the keystone — dt_to_validity, which converts float Unix seconds to a Validity inside the function, where the unit is known. @ and :as_of now accept a Validity-typed expression (@ dt_to_validity(parse_timestamp('2024-01-01'))), so together with 0.12.2's float rejection the seconds-vs-microseconds trap is closed: the raw-float misread errors loudly, and the typed path is the idiomatic spelling. Timezone-sensitive functions take an optional trailing IANA name (default UTC); parse_timestamp is widened to accept RFC3339, "YYYY-MM-DD hh:mm:ss", and bare "YYYY-MM-DD", with validity string literals sharing the same grammar. The dt_* names are now reserved against user registration.
The != inclusion–exclusion count rewrite is restored, behind a type gate (default OFF). The factorized-count != extension built in 0.10.5 and cut 34 minutes later for a silent Int/Float miscount is back, sound this time: the rewrite fires only when every occurrence of both inequality operands is a declared, non-nullable, variant-stable stored column agreeing on one type — so join equality and op_neq cannot disagree. Measured on LSQB q6 (sf0.1, sqlite): 41.7 s → 0.30 s, ~140×, with the count exactly LSQB's published oracle either way. Default stays OFF (Db::set_query_factorization) pending a nightly soak; the nightly LSQB tier now forces the toggle on and asserts the rewrite actually fired.
Errors that name what the engine expected. A failed parse now points its caret at the deepest position the parser reached and lists the literal tokens that would have been accepted there (expected one of: :=, <-, <~). Index-search diagnostics carry the code of the index kind that actually failed (fts_query_required instead of hnsw_query_required, and so on). And import_from_backup now refuses a schema mismatch — the one user-reachable way to put a value at rest that violates its column's declared type, which is exactly the invariant the != gate rests on.
A nine-bug correctness union — the hardening pass. BM25 counted N over the base relation by re-scanning the whole relation on every query (a 30× score error from non-document rows, and O(corpus) latency that made the FTS leg 2,621 ms at 400k chunks); a no-op :put drifted the FTS doc-count upward (17×); pre-epoch timestamps could panic and poison an in-memory/SQLite database; HNSW rebuilds mishandled zero/NaN/removed vectors; hybrid search conflated fusion legs with graph seeds; restore/open could silently reuse a live relation id; corrupt value blobs and corrupt HNSW/FTS index rows now surface as ordinary query errors instead of process panics; and ::reindex/::repair_corrupt now participate in imperative-program lock planning. Every one is inherited from upstream and was latent before the fork point.

→ Hybrid retrieval · Datetime functions · Time travel · What mnestic adds

0.12.2

A float in a validity is now an error, in all four places it was silently accepted. Validity and transaction-time stamps are integer microseconds since the Unix epoch. now() and parse_timestamp() return float seconds. The engine's integer accessor accepted any integral float and coerced one unit into the other without a word, so a timestamp meant for 2024 was denominated a million times too small and landed in 1970. One inherited bug, four sites: the @ valid-time selector, the @ (tt: …) / :as_of transaction-time selector, the validity(...) constructor, and the write path. Engine (cozo-core) only — the public Rust API is byte-identical to 0.12.1, and no grammar, planner or mnestic-rocks change.

The bug predates the fork, and it is as old as Cozo's time travel. Three of the four sites are verbatim upstream code at the fork point (481af05, 2024-12-04): the @ selector (expr2vld_spec), the validity(...) constructor (op_validity), and — worst — the write path itself, whose DataValue::List arm of ColType::Validity is byte-identical to upstream's. So is the accessor all three funnel through: Num::get_int, which coerces any whole-numbered float to an i64. Only the transaction-time selector is ours, and it inherited the coercion rather than introducing it — 0.10.0 extended the same accessor onto a new axis. Validity columns and @ time travel are a Cozo feature that predates the fork by years, so this is not something bitemporality broke: every CozoDB database with a Validity column has it, and the upstream code that writes 1970 still ships today. Nobody was reading it. That is what the fork is for.

Caution

Upgrade action, if any Validity column has a default built on now(). Validity default [floor(now()), true] — and every spelling of it, since now(), parse_timestamp(), round(), floor() and ceil() all return floats — has been stamping every row it wrote at 1970, in the valid-time axis, invisibly. As of 0.12.2 that schema still compiles but the next write raises eval::float_validity instead of corrupting another row. Multiply through to microseconds and truncate:

last_seen: Validity default [to_int(now() * 1000000), true]

Or use default 'ASSERT', which was always correct. Rows already written carry a valid time no upgrade can repair — they read back normally and are wrong only under time travel, so check any relation whose validity came from a float before you trust its history.

The write path stored 1970 and reported success. A :put of [parse_timestamp('2024-06-01T00:00:00Z'), true] into a Validity column succeeded and stamped the row at January 1970. The row reads back correctly on an ordinary query; the damage is visible only under a time-travel read — precisely where a bitemporal database is supposed to be trustworthy. That is permanent at-rest corruption, and it is the reason this release exists. The column now raises eval::float_validity, naming the unit and the fix.
The read paths returned nothing, and said nothing. @ parse_timestamp('2024-06-01T00:00:00Z') read the relation at a moment in 1970 — before any row was asserted — and returned zero rows with no error, indistinguishable from "no data yet". @ 1e300 was accepted and saturated to i64::MAX, silently querying the end of time. Both, plus the tt: selector and :as_of, now raise parser::float_validity_spec, which names the axis, the 1,000,000× magnitude, and the spelling that works: to_int(<expr> * 1000000).
validity(<float>) is rejected too. The constructor's own documentation said "integer"; it accepted a float and coerced it. It now means what it said.
We found exactly one caller of the broken idiom — upstream's own test suite. The HNSW test we inherited unchanged, and which upstream still ships, declared Validity default [floor(now()), true] and had been writing 1970 into upstream's own valid-time axis for as long as the test existed. It never asserted on the value, so nothing ever went red. If it was in the test suite we inherited, it is in someone's schema.
What this deliberately does not fix. An integer in seconds (@ 1704067200) is still accepted and still returns nothing, silently. Valid time is an abstract, user-settable logical clock — the tutorial itself queries @ 2019 — so no magnitude check can distinguish a wrong-unit timestamp from a legitimately small one, and a gate at 1e12 breaks fifteen of the engine's own tests. The real answer is a typed path, which arrives with the datetime library in 0.13.0. Until then: integer microseconds is the low-level form, and the string forms (@ '2024-06-01', @ '2024-06-01T12:00:00Z') are unambiguous and safe.

→ Time travel · Timestamp functions

0.12.1

Six correctness bugs inherited from upstream CozoDB — and the repair path for the worst of them. Not one is a regression the fork introduced: every one of them predates the fork point and has been latent in the engine for years. We found them in a line-by-line audit of the code we inherited, and we are naming them rather than fixing them quietly. That is what maintenance means, and it is the difference a dormant upstream cannot offer: a project nobody is reading ships none of these fixes. Engine (cozo-core) only — no cozorocks/mnestic-rocks change, no planner change, and the grammar gains exactly one system op.

Caution

Upgrade action, if you use full-text search. The FTS fix below stops new leakage but cannot evict postings that are already written. A relation carrying only an FTS index that has ever been updated in place is affected today, and upgrading alone does not repair it. Rebuild it once with the new ::reindex:

::reindex my_relation

::reindex <relation> (new). Rebuild a relation's HNSW / FTS / LSH indexes in place, from the index configuration the database already stores. It is the repair path for the FTS leak below, and it replaces the "drop and recreate the index" advice the bulk-load paths used to give — which meant reconstructing the original ::hnsw/::fts creation script (extractor, tokenizer, filters, ef_construction, m_neighbours…) by hand from ::indices output. Each index is rebuilt against its own stored manifest, which matters for LSH: the manifest keeps the derived band geometry but not the weights that produced it, so a drop-and-recreate would silently hand back an index with a different recall profile than the one you asked for. One write transaction, holds the relation's write lock, never auto-invoked; a relation with no such index is a loud no-op, so it stays scriptable across a set of relations.
Full-text postings leaked when a row was updated in place (affects every release through 0.12.0). Deleting a row's old postings was gated on a flag that counts only plain B-tree secondary indexes, so a relation carrying only an FTS index never deleted the old document's postings on a :put over an existing key. Terms the document no longer contained kept matching it, the index grew without bound, and the BM25 df/avgdl statistics drifted — measured at a 55% score error on a two-document corpus. :rm and update were unaffected. Results change on an affected relation, which is the point: there is deliberately no flag to restore the old behavior, because the old behavior was a leak. LSH does not leak despite sitting behind the same gate — its write path is self-cleaning.
The transaction API reported success for a failed commit. MultiTransaction::commit() discarded the Result the transaction thread sends back, so a commit that errored returned Ok(()) and the caller believed its data was durable. The HTTP /transact endpoint sat directly on top of it, answering 200 {"ok": true} for transactions that never committed. abort() had the same shape.
Change callbacks fired after a failed commit. In the multi-statement transaction path, subscribers received Put/Rm events for rows that were never committed — so anything syncing off the change feed (a search mirror, an audit log, a cache) could silently diverge from the database. Callbacks now dispatch only on a successful commit, honoring the contract register_callback always advertised.
Two non-default backends fixed. newrocksdb ran an optimistic transaction DB but never armed conflict validation, so two transactions could read a key, both write it, and both commit — one acknowledged write silently vanished. sled's del() never deleted (upstream #306): it wrote the put marker where the delete marker belonged, so exists kept answering true and the commit re-inserted the key. Both backends now run a transaction-contract suite in CI; until 0.12.1 nothing in CI even compiled them.
import_from_backup silently stranded HNSW/FTS/LSH indexes. It guarded only against B-tree indexes, so an operator could restore a backup and have hybrid retrieval quietly return nothing for the restored rows, with no signal anywhere. It now warns, as import_relations already did — and both warnings now point at ::reindex.

→ System ops · Proximity search · Embedding mnestic

0.12.0

Budgeted weighted traversal. BudgetedTraversal is a new graph-algo fixed rule: cheapest-first expansion from a set of seeds, over non-negative edge weights, under a required global budget of distinct nodes — the missing primitive for filling a fixed context window with the cheapest graph neighborhood around what search found. Pure cozo-core and purely additive: one new reserved rule name, no grammar or planner change, no cozorocks/mnestic-rocks change, and no change to any existing query.

The max_nodes cheapest distinct admissible nodes, with evidence. Each admitted node emits (node, cost, parent, depth) — the settled shortest-path-tree fragment — deterministic by construction: admission follows a (cost, node) total order and the parent/depth witness resolves by strict lexicographic relaxation. Positional edges and a cached graph: projection produce byte-for-byte identical output. Weights are consumed as costs — monotone transforms like -ln(weight) are the caller's.
max_cost bounds admissible path cost; max_depth is an exact hop bound — layered labels, never depth-pruned Dijkstra, so a node whose cheapest path overshoots the hop bound is still found through its cheapest within-bound path.
An optional gate relation plus admit: predicate filters mid-expansion. A gated-out node spends no budget and never bridges to what lies beyond it — semantics a host-side post-filter cannot reproduce.
Interruptible, and measured. The loop honors :timeout / ::kill, and at the release's merge gate one call over a cached projection ran 2–4× faster than the production host-side BFS it replaces.
The optional rayon dependency is now bounded >=1.10, <1.11 — rayon 1.11 breaks graph_builder 0.4.x, the CSR-builder crate behind graph-algo, so a fresh downstream resolve now lands on a working pair.

→ Utilities & algorithms · Graph projections

0.11.1

Built-in skyline aggregates. pareto_min and pareto_max keep, per group, the Pareto frontier of a numeric vector — the points no other point dominates — so a query can surface a contested set of equally-good answers instead of collapsing to a single winner. Pure cozo-core — no cozorocks/mnestic-rocks change and no change to any existing query.

pareto_min(v) / pareto_max(v) — the skyline, per group. Given a numeric vector v per row, each keeps the non-dominated frontier under componentwise order — pareto_min treats smaller as better on every component, pareto_max larger — emitting one row per survivor. Mixed objectives (minimize price and maximize quality) are the sign-flip idiom: negate the maximized components and use pareto_min, e.g. v = [price, -quality].
No host registration, reachable from every binding. The dominance is native (componentwise), so unlike the registered antichain bounded-meet from 0.10.1 these need no register_bounded_meet_aggr — they work from the PyPI wheel, cozo-bin, langchain and llama-index through plain run_script. They also compose in recursive rules, inheriting the confluence and cycle-pruning of the dominance aggregate they build on.
A malformed operand is a loud error — a non-list, a non-numeric or NaN component, or an empty vector fails the query rather than silently dropping the row.

→ Aggregations

0.11.0

Cached graph projections. ::graph create g { edges: knows, nodes: person } names an in-memory adjacency that twelve graph algorithms reuse across queries via a graph: option, instead of scanning the edges and rebuilding the CSR on every call. Engine plus the Python binding (for the capacity knob) — no query-result change except the PageRank default noted below.

New: name a graph once and reuse it. A projection is a named, in-memory adjacency over stored relations. Build it with ::graph create, then pass graph: 'g' to any of the twelve algorithms that take an edges relation (ConnectedComponents, SCC, PageRank, ClusteringCoefficients, TopSort, BetweennessCentrality, ClosenessCentrality, ShortestPathDijkstra, KShortestPathYen, MinimumSpanningTreePrim, MinimumSpanningForestKruskal, LabelPropagation, CommunityDetectionLouvain) in place of the positional edge relation. ::graph list and ::graph drop inspect and free it.
Always fresh, never stale. A projection never serves a transaction data that differs from what that transaction's own scan of the sources would return, and writing to a source frees what was built from it. Under write churn it degrades to build-per-query; it never goes stale. Projections are in-memory and are not persisted — re-create them after a restart.
Measured on a 400,000-edge graph (cold = the previous positional form): ConnectedComponents 127 ms → 7.9 ms (16×), PageRank at 20 iterations 150 ms → 10 ms (15×), ClusteringCoefficients 169 ms → 56 ms (3×). What is cached is the setup — scanning edges and building the CSR — so the gain shrinks as the kernel dominates.
BREAKING (results): PageRank's default iterations is now 20, up from 10. Ten was a below-upstream default and measurably non-convergent at the default epsilon. Pass iterations: 10 to restore the old numbers. PageRank now also warns when the iteration cap stops it short of epsilon.
PageRank accepts an optional node relation, as ConnectedComponents already did, so vertices with no edges are ranked instead of silently dropped (they enter N, which moves every rank).
A 512 MiB memory ceiling on cached adjacencies, settable from Rust (set_graph_projection_capacity) and Python; 0 disables caching while leaving ::graph create/list/drop working.
Fix: an empty edge relation no longer aborts the process in seven graph algorithms (TopSort, ConnectedComponents, StronglyConnectedComponents, ClusteringCoefficients, BetweennessCentrality, ClosenessCentrality, LabelPropagation) — they now return no rows, as PageRank already did.
Fix: multi_transaction could deadlock a process by parking a rayon worker for the transaction's whole lifetime; it now runs on a dedicated thread. Affects every caller of that API, not just graph algorithms.

0.10.7

Two targeted patches on top of 0.10.6. A join-reorder plan-quality fix restores a cyclic-join query the 0.10.5 reorder could regress, and the Python binding gains the factorization kill switch that was Rust-only. Engine plus the Python binding — no cozorocks/mnestic-rocks change and no query-result change.

Fix: the greedy join reorder no longer demotes a full-composite-key filter to a partial-key expansion. A tie-break bug in full_key_lookup_bonus could pull a high-fan-out edge ahead of a more selective atom, regressing a cyclic-join query — a benchmarker measured LDBC-SNB LSQB Q3 go from ~19s to a timeout, and the fix restores it. Result sets are unchanged and the min-new-vars speed-up from 0.10.5 is preserved.
Python: the factorization kill switch is now toggleable. db.set_query_factorization(True) and db.query_factorization() are now exposed on the Python binding, so the 0.10.5 factorized-count() rewrite — previously reachable only from Rust — can be flipped from Python. The default stays off.

0.10.6

An urgent upgrade-safety patch. The headline fixes a data-availability regression the 0.10.0 bitemporality work introduced: a relation catalog last written before 0.10.0 (or by an index/rename/destroy path) could fail to open, taking the whole database down. Anyone who upgraded a pre-0.10.0 database to any of 0.10.0–0.10.5 should upgrade to 0.10.6. Engine (cozo-core) plus Python-wheel CI — no cozorocks/mnestic-rocks change and no query-behavior change.

Fix: relation catalogs written before 0.10.0 no longer fail to open ("Cannot deserialize relation metadata from bytes"). The 0.10.0 bitemporality work inserted a new RelationHandle field mid-struct, and the pre-0.10.0 catalog-write paths encode the struct positionally — so #[serde(default)], which only rescues a missing trailing element, could not recover a relation whose catalog was last written as the older 13-field positional array (any graph created before 0.10.0, or updated by an index/rename/destroy path). It failed to deserialize on open, taking the whole database down. The two-part fix moves the new field to the last position so the trailing default applies to legacy layouts, and switches the seven catalog-rewrite paths (::index/HNSW/FTS/LSH create, relation rename, index destroy) to a self-describing, field-named encoding like the create path — so a future field addition can't reintroduce this class of bug. No migration: legacy catalogs stay readable and re-canonicalize to the self-describing form on their next write. Regression-guarded by a real pre-0.10.0 catalog fixture.
Greedy join reorder refactored to a pure function (internal). The deterministic join-reorder pass shipped in 0.10.5 is now a pure function over a resolved schema view, making it independently unit-testable. No query-plan or behavior change.
Python-wheel CI hardened for storage-rocksdb. The x86_64 manylinux leg now builds on manylinux_2_28 with libclang installed so zstd-sys's bindgen resolves; the aarch64, macOS and Windows legs are unaffected. Wheel-build only — no engine change.

0.10.5

A liveness and performance release: queries you can always stop, and naively-ordered queries that stop being pathological. Engine plus the Python binding and its wheel CI — no cozorocks/mnestic-rocks change.

Interruptible ::kill and ::running — both now dispatch before opening a storage transaction, so on the mem/sqlite backends a ::kill no longer queues behind the very read query it is trying to kill (they touch only the in-memory running-query registry). And the per-query poison flag is now checked every 4,096 pulls inside the relational-algebra enumeration — so a long single-rule join that yields no output for a while is finally interruptible, not just between rule applications. The Python close() is fixed too: it no longer raises "Already borrowed" while a run_script is live (the handle moved to interior mutability).
Per-query wall-clock budget — a query can carry a deadline three ways: the in-script :timeout <secs> option, a per-call run_script_with_options(…, ScriptRunOptions { timeout }), and a Db-wide set_default_query_timeout. The effective deadline is the minimum of whichever are set — a :timeout can tighten the budget but never extend past the Db default. Expiry raises a distinct eval::timeout diagnostic (a ::kill still raises eval::killed); a budget-aborted mutable script rolls back with no partial commit. Python run_script gains a timeout= kwarg. Wasm carries no wall-clock budget — it has no monotonic clock.
Deterministic greedy join reorder — default ON (opt out per query with :reorder written). No pass previously considered join order, so a naively-ordered conjunction — exactly the shape an LLM authors — could spin on an N³ intermediate. A stat-free min-new-vars greedy pre-pass (after the equality-pushdown pass) reorders the positive relation atoms of an eligible conjunction, measured 54.5× faster on the repro (N³ → N²). Results are unchanged: the pass is the identity on any stepwise-greedy-consistent written order, so hand-tuned plans stay byte-identical, and it excludes the multi-valued in-unifications that feed aggregation. It is not a cost-based optimizer — there are no cardinality stats. A residual Cartesian step (a genuinely disconnected conjunction) is warned and annotated (cartesian) in ::explain.
Automatic factorized count() rewrite — opt-in, default OFF (behind set_query_factorization). Rewrites an eligible single-clause count()-over-positive-join into per-key counting sub-rules — a bit-identical (exact-i64, Int-typed) answer computed without ever materializing the join (benchmarked 4–342× vs a factorizing optimizer). It fires only on shapes it can prove exact; a body with any != predicate declines to exact naive evaluation (the != inclusion-exclusion rewrite was cut before release for miscounting on mixed Int/Float data — the manual patterns live in the cardinality-algebra spec). An always-on detector adds a factorization advisory to ::explain.
RocksDB now ships in the PyPI mnestic wheel — CozoDbPy("rocksdb", path) works straight from pip install mnestic (the wheel was compact/SQLite-only before). The sdist stays compact, so the persist engine is wheel-only.
Bulk import_relations into an index-bearing relation now warns — the bulk path maintains B-tree secondary indexes but not HNSW/FTS/LSH, so imported rows stay invisible to vector/text search until the index is rebuilt. A warning now flags it (it stays a warning, not a hard error: importing a snapshot then reindexing is a legitimate workflow).

0.10.1

A small additive release on top of 0.10.0: two new query primitives plus a correctness fix to the meet aggregates. Pure cozo-core.

Dominance bounded-meet — the antichain / skyline aggregate. register_bounded_meet_aggr(name, dominates, max_survivors) lets a host register a strict partial order; the head form name(operand) then keeps, per group, the non-dominated (Pareto-frontier) set of operands — each survivor its own output row. Survivors are held in canonical memcmp order, so output is arrival-independent, and max_survivors is a mandatory resource guard: overflow is a loud error, never a silent truncation (an antichain has no canonical k-subset). Rust-embedded v1 — host closures do not cross the Python/served surfaces yet.
Interval primitives — interval_overlaps(a, b) (builtin function) and interval_coalesce(span) (aggregate) over half-open [start, end) list intervals. Touching spans do not overlap but do coalesce ([0,5) + [5,10) = [0,10)); empty spans [x, x) overlap nothing; mixed int/float bounds compare numerically; malformed spans are loud errors, never silent falses. interval_coalesce is deliberately named away from the shipped null-coalescing coalesce builtin.
bit_and / bit_or changed-bit fix — the meet aggregates now report whether the value actually changed; the byte loop previously returned true unconditionally, so a non-changing fold re-entered the semi-naive delta every epoch (the same defect family as the 0.10.0 and/or fix). The bounded-meet divergence cap was also hardened to count total changed epochs rather than a resettable consecutive streak.

0.10.0

Two pillars, in order of impact.

Bitemporality — engine-assigned transaction time alongside Cozo's valid time. System-versioned (tt: TxTime) and fully bitemporal (vld: Validity, tt: TxTime) relations: a crash-safe monotone commit clock stamps every write; reads default to the current belief and time-travel with @ (vt: …, tt: …) or the :as_of query option; existence-checking writes target the resolved current belief; and ::history / ::history_gc (persisted floor) / ::evict (audited hard deletion, the one deliberate break of append-only, for GDPR) manage the record's lifecycle. Current-belief reads stay within ~4–12% of the single-axis baseline. "What did we believe at time T about period Y" — in-engine.
Provenance semirings — the same recursive rules compute existence, cost, confidence, or evidence. register_custom_aggr admits user-defined absorptive combines into recursion; min_cost_k([payload, cost], k) is a bounded-meet aggregate returning the k best whole derivations per answer, with the evidence chains that justify them (k-shortest-paths, and "the k most-likely paths plus their exact evidence", fall straight out); and :reconcile is recompute-based belief revision that keeps derived annotations consistent under base-fact retraction, composed with the transaction-time axis.
Four upstream CozoDB bugs fixed along the way — an inverted changed-bit in the and/or meet aggregates, a panic on negated validity atoms, wrong answers from prefix-truncated temporal-column joins, and the braced-%return imperative parse panic.

0.9.0

Adds the read-only Cypher query surface and bundles the corrupt-database tooling that was banked as 0.8.6 but never separately published (0.8.5 → 0.9.0 ships both; there is no standalone 0.8.6 crates.io artifact).

Read-only Cypher query surface (alpha; opt-in cypher feature, off by default) — translate a subset of openCypher to CozoScript so the engine can be evaluated and adopted without first learning Datalog. New API: run_cypher / cypher_to_script, driven by a caller-supplied schema that maps the property-graph model onto stored relations (both the relation-per-label and the shared-relation-with-discriminator conventions). v1 covers MATCH / WHERE / RETURN (with count/sum/avg/min/max/collect) / ORDER BY / SKIP / LIMIT, with true bag semantics and null-aware WHERE. Datalog stays the native, full-power language — this is a read-only on-ramp (no write clauses). The published PyPI wheel ships without it for now.
Corrupt-database tooling (banked as 0.8.6) — ::repair_corrupt <relation> surgically deletes truncated tuples by their intact store keys, a surgical alternative to dropping a database that fails integrity checks. And ::index create now skips corrupt tuples with a loud error naming the relation and the arity mismatch, instead of panicking — one bad row previously made a database that (re)creates an index at startup unopenable.
Two cozo-bin fixes — bearer auth is now honored on query-string URLs (/transact?write=true no longer rejects a valid token), and the binary now builds bare with a runnable-out-of-the-box default feature combo (compact = sqlite + requests + graph-algo).

0.8.5

Index builds, 15× faster — plus the read path stops paying for transactions it doesn't need.

Flat in-RAM parallel index builds — ::hnsw create used to build the graph through the storage engine's tuple encoding: every neighbour visit paid a decode, a hash of the full compound key, and allocator traffic — a profile showed less than half the build was distance math. The build now works the way hnswlib and pgvector do: a contiguous vector slab, integer-ID adjacency arrays, parallel insertion with per-node locks, then one serialisation pass into the existing on-disk format. Nothing changes for readers: same tuple layout, same search path, same incremental maintenance, and the non-blocking build (reads proceed during construction) is preserved. Measured on the 40k × 384-dim RocksDB corpus: ::hnsw create 294 s → 19 s synthetic, 89.1 s → 8.1 s with real embeddings, recall@10 unchanged. Decomposition: 3.2× from the flat layout (serial), ~5× from parallel insertion. MNESTIC_INDEX_BUILD_THREADS=1 restores serial insertion; parallel builds produce equivalent-recall graphs, not byte-identical ones — guarded by a recall-agreement test. ::fts create gets the same treatment (no deletion pass against a provably-empty index, parallel tokenisation): ~2× on short documents, more on long ones.
Plain-snapshot read path (RocksDB) — read-only scripts no longer open a pessimistic transaction; they read through a plain snapshot (the standard MVCC read pattern — TiKV and CockroachDB made the same call). Same consistent view as before, no lock-manager bookkeeping, and reads structurally cannot wait on writer locks. Keyed point reads: p50 28.5 → 23.9 µs (−16%), p99 −19%. Retrieval-scale queries on a cache-resident corpus are unchanged, as expected — and a write attempted through a read-only transaction now errors explicitly. Isolation semantics are pinned by tests.
Batched HNSW neighbour reads — hnsw_search_level fetches unvisited neighbours' vectors through one RocksDB MultiGet per expansion step (shared bloom probes, batched block reads) instead of one serial point get per neighbour. Neutral on a cache-resident corpus; the win case is cold-cache and larger-than-RAM data.
::describe works now — documented, implemented, and tested at the operation level upstream, but never reachable: the grammar rule was missing from the script-level alternations, since before the fork. Wired in, with a read-only-mode guard. A small fix, and a fair illustration of what "maintained" means: someone is reading this code.
Requires mnestic-rocks 0.1.9 for the rocksdb feature.

0.8.4

A defect fix plus the fusion-explainability primitive. Also the first release with the Python family on PyPI: mnestic (abi3 wheels for Linux x86_64 + aarch64, macOS x86_64 + arm64, Windows), langchain-mnestic, and llama-index-vector-stores-mnestic.

Per-leg fusion detail — ReciprocalRankFusion(..., detailed: true) switches the output to one row per (item, contributing list): [item, fused_score, list_id, leg_rank, leg_score], where leg_rank is the 1-based within-list rank the fusion actually used. The fused score reconstructs exactly as Σ 1/(k + leg_rank). HybridSearch::detailed plumbs it through the one-call helper (Python: detailed=True) — the mechanism behind "why was this retrieved" surfaces.
Concurrency fix — 0.8.3's durable avgdl counter was one shared storage key written inside every document transaction; under RocksDB pessimistic transactions, concurrent writers to any FTS-indexed relation conflicted on that key (and the unlocked read-modify-write lost updates). The doc-stats counter is now process-cached and scan-seeded — one scan per index per process, maintained in memory, no shared hot-path key. Per-query avgdl stays O(1); BM25 scores are unchanged.

0.8.3

Two agentic-memory wedge features, validated end-to-end on the hybrid-recall benchmark (40k chunks, vs SQLite / DuckDB / LanceDB / Kuzu).

Native 3-way fused recall — graph proximity is now a typed GraphLeg on hybrid_search. Each leg expands from seed nodes over a stored edge relation up to max_hops, scores every reached node by its minimum hop distance, and feeds that ranked list into the same RRF as the vector and keyword legs — one call, one transaction, no hand-written recursion. The fused call runs all three signals at 41.55 ms p50 (recall 0.873), ~4× faster than the hand-decomposed path, fusing a signal no 2-way engine can.
BM25-correct full-text search — the default ::fts score kind is now Okapi BM25 (term-frequency saturation k1 + document-length normalization b), and OR now sums per-term contributions instead of taking the max. avgdl is an O(1) read instead of a per-query index scan (the 0.8.3 durable-counter design was replaced by a process-level cache in 0.8.4 — see above). Fused recall 0.75 → 0.954 (parity with DuckDB's 0.957); cold p99 tail 2,900 → 258 ms.

Caution

Behavior change: the default ::fts scorer moved from tf_idf to bm25. Pass score_kind: 'tf_idf' (or 'tf') for byte-identical upstream scoring.

→ Hybrid retrieval (RRF + MMR + graph legs)

0.8.2

Non-blocking HNSW index builds. ::hnsw create no longer holds the base relation's write lock during graph construction, so concurrent reads no longer stall for the build's duration. Measured: 90,507 reads completed (slowest 0.8 ms) during a ~5.6 s, 40,000-vector build that would previously have blocked them all. RocksDB only.

→ Non-blocking HNSW builds

0.8.1

One-call hybrid retrieval — hybrid_search runs HNSW + FTS (+ optional graph traversal), fuses with RRF, and optionally diversifies with MMR in a single typed call.
HNSW index build ~3× faster (20k × 128: 135 s → 43.6 s, measured release); the built graph is byte-identical.
mnestic-rocks — the C++/RocksDB bridge is now a maintained fork (importable name stays cozorocks).
Blocking clippy CI gate.

→ Hybrid retrieval (RRF + MMR)

0.8.0

First fork release: CozoDB 0.7.6 plus 30 unreleased upstream commits (the fork point), bumped to 0.8.0 to mark the fork's identity.

Equality pushdown — keyed stored_prefix_join for equality post-filters (~28–29× faster single-row lookups at 5k rows).
ReciprocalRankFusion / MaximalMarginalRelevance fixed rules (aliases RRF / MMR).
ULID functions — rand_ulid(), ulid_timestamp().
Parser fix — keyword-prefixed identifiers parse correctly (upstream #281).
env_logger moved to a dev-dependency (upstream #287).

→ Equality pushdown · ULID identifiers

Upstream CozoDB history

CozoDB's own release notes (v0.1 through v0.7) — covering the introduction of HNSW vector search, full-text search, MinHash-LSH, JSON values, and time travel — remain available in the original documentation. mnestic inherits all of that functionality.