Show HN: SNKV – SQLite's B-tree as a key-value store (C/C++ and Python bindings)

2026-02-24 12:59 · github.com



SNKV is a lightweight, ACID-compliant embedded key-value store built directly on SQLite's B-Tree storage engine — without SQL.

The idea: bypass the SQL layer entirely and talk directly to SQLite's storage engine. No SQL parser. No query planner. No virtual machine. Just a clean KV API on top of a proven, battle-tested storage core.

SQLite-grade reliability. KV-first design. Lower overhead for read-heavy and mixed key-value workloads.

Single-header integration — drop it in and go:

#define SNKV_IMPLEMENTATION
#include "snkv.h"
#include <stdio.h>

int main(void) {
    KVStore *db;
    kvstore_open("mydb.db", &db, KVSTORE_JOURNAL_WAL);

    kvstore_put(db, "key", 3, "value", 5);

    void *val;
    int len;
    kvstore_get(db, "key", 3, &val, &len);
    printf("%.*s\n", len, (char *)val);

    snkv_free(val);
    kvstore_close(db);
}

Use kvstore_open_v2 to control how the store is opened. Zero-initialise the config and set only what you need — unset fields resolve to safe defaults.

KVStoreConfig cfg = {0};
cfg.journalMode = KVSTORE_JOURNAL_WAL; /* WAL mode (default) */
cfg.syncLevel = KVSTORE_SYNC_NORMAL; /* survives process crash (default) */
cfg.cacheSize = 4000; /* ~16 MB page cache (default 2000 ≈ 8 MB) */
cfg.pageSize = 4096; /* DB page size, new DBs only (default 4096) */
cfg.busyTimeout = 5000; /* retry 5 s on SQLITE_BUSY (default 0) */
cfg.readOnly = 0; /* read-write (default) */

KVStore *db;
kvstore_open_v2("mydb.db", &db, &cfg);
Field        Default                  Options
journalMode  KVSTORE_JOURNAL_WAL      KVSTORE_JOURNAL_DELETE
syncLevel    KVSTORE_SYNC_NORMAL      KVSTORE_SYNC_OFF, KVSTORE_SYNC_FULL
cacheSize    2000 pages (~8 MB)       any positive integer
pageSize     4096 bytes               power of 2, 512–65536; new DBs only
readOnly     0                        1 to open read-only
busyTimeout  0 (fail immediately)     milliseconds; useful for multi-process use
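Since cacheSize is counted in pages, the resident cache size is pages times pageSize. A quick check of the two values quoted above (an illustrative script, not part of SNKV):

```python
# cacheSize is in pages; cache bytes = pages * pageSize.
page_size = 4096  # default page size from the table above

for pages in (2000, 4000):
    mib = pages * page_size / 2**20
    print(f"{pages} pages x {page_size} B = {mib:.1f} MiB")
```

This prints 7.8 MiB and 15.6 MiB, which the README rounds to ~8 MB and ~16 MB.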

kvstore_open remains fully supported and uses all defaults except journalMode, which it takes as a direct argument.

make # builds libsnkv.a
make snkv.h # generates single-header version
make examples # builds examples
make run-examples # run all examples
make test # run all tests (CI suite)
make clean

1. Install MSYS2.

2. Launch "MSYS2 MinGW 64-bit" from the Start menu (not the plain MSYS2 terminal).

3. Install the toolchain:

pacman -S --needed mingw-w64-x86_64-gcc make

4. Clone and build:

git clone https://github.com/hash-anu/snkv.git
cd snkv
make # builds libsnkv.a
make snkv.h # generates single-header
make examples # builds .exe examples
make run-examples
make test

All commands must be run from the MSYS2 MinGW64 shell. Running mingw32-make from a native cmd.exe or PowerShell window will not work — the Makefile relies on sh and standard Unix tools that are only available inside the MSYS2 environment.

Available on PyPI — no compiler needed:

from snkv import KVStore

with KVStore("mydb.db") as db:
    db["hello"] = "world"
    print(db["hello"].decode())  # world

Full documentation — installation, API reference, examples, and thread-safety notes — is in python/README.md.

SNKV Python API Demo

A production-scale kill-9 test is included but kept separate from the CI suite. It writes unique deterministic key-value pairs into a 10 GB WAL-mode database, forcibly kills the writer with SIGKILL during active writes, and verifies on restart that every committed transaction is present with byte-exact values, no partial transactions are visible, and the database has zero corruption.

make test-crash-10gb # run full 5-cycle kill-9 + verify (Linux / macOS)

# individual modes:
./tests/test_crash_10gb write tests/crash_10gb.db # continuous writer
./tests/test_crash_10gb verify tests/crash_10gb.db # post-crash verifier
./tests/test_crash_10gb clean tests/crash_10gb.db # remove DB files

Requires ~11 GB of free disk. The full kill-and-verify cycle (make test-crash-10gb) is POSIX-only since it relies on SIGKILL; the write and verify modes work on all platforms.
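The verification trick generalizes: if each value is derived deterministically from its key, a post-crash pass can check byte-exactness without any external manifest. A miniature sketch of that pattern using Python's stdlib sqlite3 (not the actual harness):

```python
import hashlib
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "crash.db")

con = sqlite3.connect(path)
con.execute("PRAGMA journal_mode=WAL")
con.execute("CREATE TABLE kv (k BLOB PRIMARY KEY, v BLOB) WITHOUT ROWID")
for i in range(100):
    k = f"key-{i:06d}".encode()
    # value is a pure function of the key, so it can be re-derived later
    con.execute("INSERT INTO kv VALUES (?, ?)", (k, hashlib.sha256(k).digest()))
    con.commit()  # every committed transaction must survive a kill -9
con.close()

# "Restart": reopen and verify every committed pair byte-for-byte.
con = sqlite3.connect(path)
ok = all(hashlib.sha256(k).digest() == v
         for k, v in con.execute("SELECT k, v FROM kv"))
print("verified:", ok)  # verified: True
```

The real harness does the same at 10 GB scale, with the writer killed mid-transaction between the write and verify phases.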

Standard database path:

Application → SQL Parser → Query Planner → VDBE (VM) → B-Tree → Disk

SNKV path:

Application → KV API → B-Tree → Disk

By removing the layers you don't need for key-value workloads, SNKV keeps the proven storage core and cuts the overhead.

Layer          SQLite  SNKV
SQL Parser       ✓      —
Query Planner    ✓      —
VDBE (VM)        ✓      —
B-Tree Engine    ✓      ✓
Pager / WAL      ✓      ✓

1M records, Linux, averaged across 3 runs. Both SNKV and SQLite use identical settings: WAL mode, synchronous=NORMAL, 2000-page (8 MB) page cache, 4096-byte pages.

Benchmark source: SNKV · SQLite

The SQLite benchmark uses WITHOUT ROWID with a BLOB primary key — the fairest possible comparison: both engines use a single B-tree keyed on the same field, with the identical settings listed above. This isolates the pure cost of the SQL layer for KV operations.
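For reference, the SQLite side of that comparison boils down to a single-B-tree table like the one below (sketched here with Python's stdlib sqlite3; the actual benchmark code lives in the linked repos):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# WITHOUT ROWID stores rows in one B-tree keyed directly on the BLOB key,
# mirroring a KV layout: no separate rowid tree, no secondary index.
con.execute("CREATE TABLE kv (k BLOB PRIMARY KEY, v BLOB) WITHOUT ROWID")
con.execute("INSERT INTO kv VALUES (?, ?)", (b"key", b"value"))
(v,) = con.execute("SELECT v FROM kv WHERE k = ?", (b"key",)).fetchone()
print(v)  # b'value'
```

Every get/put still pays for parsing (or at least binding and stepping a prepared statement) plus the VDBE dispatch loop — the overhead SNKV removes.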

Note: Both SNKV and SQLite (WITHOUT ROWID) use identical peak RSS (~10.8 MB) since they share the same underlying pager and page cache infrastructure.

Benchmark          SQLite       SNKV         Notes
Sequential writes  140K ops/s   146K ops/s   SNKV 1.05x faster
Random reads       87K ops/s    139K ops/s   SNKV 1.6x faster
Sequential scan    1.61M ops/s  3.16M ops/s  SNKV 2x faster
Random updates     17K ops/s    24K ops/s    SNKV 1.4x faster
Random deletes     17K ops/s    20K ops/s    SNKV 1.2x faster
Exists checks      87K ops/s    149K ops/s   SNKV 1.7x faster
Mixed workload     35K ops/s    50K ops/s    SNKV 1.4x faster
Bulk insert        211K ops/s   240K ops/s   SNKV 1.1x faster

With identical storage configuration, SNKV wins across every benchmark. The gains come from two sources: bypassing the SQL layer (no parsing, no query planner, no VDBE) and a per-column-family cached read cursor that eliminates repeated cursor open/close overhead on the hot read path. The biggest wins are on read-heavy operations — random reads (+60%), exists checks (+70%), and sequential scan (+100%) — exactly where the cursor caching pays off most.
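The cached-cursor idea is simple to state in code. An illustrative sketch (not SNKV's implementation): pay the open cost at most once per column family, then every subsequent read seeks on the already-warm handle.

```python
class Cursor:
    """Stand-in for a B-tree cursor; counts how many were ever opened."""
    opens = 0

    def __init__(self):
        Cursor.opens += 1


class Store:
    def __init__(self):
        self._cursors = {}  # one cached read cursor per column family

    def cursor(self, cf):
        cur = self._cursors.get(cf)
        if cur is None:                      # open lazily on first use...
            cur = self._cursors[cf] = Cursor()
        return cur                           # ...then reuse the warm cursor


store = Store()
for _ in range(1000):
    store.cursor("users")  # hot path: no open/close per operation
print(Cursor.opens)        # 1, not 1000
```

SQLite's SQL layer, by contrast, opens and closes a B-tree cursor inside each statement execution, which is where the read-side gap in the table above comes from.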

If you want to benchmark SNKV against LMDB or RocksDB, the benchmark harnesses are here:

SNKV is a good fit if:

  • Your workload is read-heavy or mixed (reads + writes)
  • You're running in a memory-constrained or embedded environment
  • You want a clean KV API without writing SQL strings, preparing statements, and binding parameters
  • You need single-header C integration with no external dependencies
  • You want predictable latency — no compaction stalls, no mmap tuning

Consider alternatives if:

  • You need maximum write/update/delete throughput → RocksDB (LSM-tree)
  • You need maximum read/scan speed and memory isn't a constraint → LMDB (memory-mapped)
  • You already use SQL elsewhere and want to consolidate → SQLite directly
Key features:

  • ACID Transactions — commit / rollback safety
  • WAL Mode — concurrent readers + single writer
  • Column Families — logical namespaces within a single database
  • Iterators — ordered key traversal
  • Thread Safe — built-in synchronization
  • Single-header — drop snkv.h into any C/C++ project
  • Zero memory leaks — verified with Valgrind
  • SSD-friendly — WAL appends sequentially, reducing random writes
  • Python Bindings — idiomatic Python 3.8+ API with dict-style access, context managers, typed exceptions, and prefix iterators — see python/README.md

Because SNKV uses SQLite's file format and pager layer, backup tools that operate at the WAL or page level work out of the box:

  • LiteFS — distributed SQLite replication works with SNKV databases
  • SQLite Online Backup API — operates at the page level, fully compatible
  • WAL-based backup tools — any tool consuming WAL files works correctly
  • Rollback journal tools — journal mode is fully supported

Note: Tools that rely on SQLite's schema layer — like the sqlite3 CLI or DB Browser for SQLite — won't work. SNKV bypasses the schema layer entirely by design.
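The Online Backup API, for instance, is exposed by Python's stdlib sqlite3 module, and it copies pages rather than rows, so it needs no schema awareness. A sketch (using an ordinary SQLite table as the source; for an SNKV database you would skip the SQL statements and only call backup):

```python
import sqlite3

src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE kv (k BLOB PRIMARY KEY, v BLOB) WITHOUT ROWID")
src.execute("INSERT INTO kv VALUES (?, ?)", (b"k", b"v"))
src.commit()

dst = sqlite3.connect(":memory:")
src.backup(dst)  # SQLite Online Backup API: page-level copy, not row-level

print(dst.execute("SELECT v FROM kv WHERE k = ?", (b"k",)).fetchone())
```

The copy on dst is byte-identical at the page level, which is exactly why the same mechanism works for any file sharing SQLite's format.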

I documented the SQLite internals explored while building this. The guiding principles:

  • Minimalism wins — fewer layers, less overhead
  • Proven foundations — reuse battle-tested storage, don't reinvent it
  • Predictable performance — no hidden query costs, no compaction stalls
  • Honest tradeoffs — SNKV is not the fastest at everything; it's optimized for its target use case

Apache License 2.0 © 2025 Hash Anu



Comments

  • By franticgecko3 · 2026-02-24 13:29 · 2 replies

    OP seems to self promote this project and other similar vibe coded works every few weeks under two different HN handles.

    Edit: for me this post appears on the front page of HN. OP this is mission success - add this project to your résumé and stop spamming.

    • By gjgtcbkj · 2026-02-24 13:41

      Yeah I wish substack would stop doing this too. They keep inserting their brand in HN under different handles.

  • By ThrowawayR2 · 2026-02-24 15:04

    Nine reposts across the last 21 days across a couple of fresh accounts, some of which seem to be banned(?).

    - Show HN: SNKV – KV store on SQLite's B-tree with 11x less memory than RocksDB (github.com/hash-anu) 3 points by swaminarayan 6 days ago

    - I read 150K lines of SQLite source — here’s how its B-Tree powers a KV store (github.com/hash-anu) 1 point by hashmakjsn 6 days ago | 1 comment

    - Show HN: SnkvDB – Single-header ACID KV store using SQLite's B-Tree engine (github.com/hash-anu) 5 points by hashmakjsn 8 days ago | 1 comment

    - Show HN: SNKV benchmark with RocksDB (github.com/hash-anu) 2 points by hashmakjsn 13 days ago

    - Show HN: SNKV and LiteFS – Distributed KV store with automatic replication (github.com/hash-anu) 3 points by hashmakjsn 14 days ago

    - Show HN: In SQLite v3.51.2 skipped query layers and accessed b-tree APIs for KV 1 point by hashmak_jsn 15 days ago

    - Show HN: Developed key value storage using SQLite's b-tree APIs directly (github.com/hash-anu) 4 points by hashmakjsn 15 days ago

    - Show HN: Developed key value storage using SQLite b-tree APIs directly (github.com/hash-anu) 1 point by hashmak_jsn 20 days ago

    - Show HN: SNKV — A Key-Value Store Built Directly on SQLite’s B-Tree APIs (github.com/hash-anu) 1 point by hashmak_jsn 21 days ago

  • By Retr0id · 2026-02-24 13:11 · 3 replies

    I'm surprised by your benchmark results.

    I've considered building this exact thing before (I think I've talked about it on HN even), but the reason I didn't build it was because I was sure (on an intuitive level) the actual overhead of the SQL layer was negligible for simple k/v queries.

    Where does the +104% on random deletes (for example) actually come from?

    • By swaminarayan · 2026-02-24 13:15 · 3 replies

      Fair skepticism — I had the same intuition going in.

      The SQL layer overhead alone is probably small, you're right. The bigger gain comes from a cached read cursor. SQLite opens and closes a cursor on every operation. SNKV keeps one persistent cursor per column family sitting open on the B-tree. On random deletes that means seek + delete on an already warm cursor vs. initialize cursor + seek + delete + close on every call.

      For deletes there's also prepared statement overhead in SQLite — even with prepare/bind/step/reset, that's extra work SNKV just doesn't do.

      I'd genuinely like someone else to run the numbers. Benchmark source is in the repo if you want to poke at it — tests/test_benchmark.c on the SNKV side and https://github.com/hash-anu/sqllite-benchmark-kv for SQLite. If your results differ I want to know.

      • By Retr0id · 2026-02-24 13:22 · 1 reply

        What does "column family" mean in this context?

        • By swaminarayan · 2026-02-24 13:23 · 1 reply

          A named key space within the same database file — keys in "users" don't collide with keys in "sessions" but both share the same WAL and transaction.

          • By bflesch · 2026-02-24 13:27 · 1 reply

            Did you measure the performance impact of having multiple trees in a single file vs. having one tree per file? I'd assume one per file is faster, is that correct?

            • By swaminarayan · 2026-02-24 14:11

              no dont know about it. I will check it out.

      • By d1l · 2026-02-24 13:23 · 1 reply

        Are you using ai for the comment replies too?!

        • By derwiki · 2026-02-24 13:32

          Everyone knows the emdash is a giveaway, and they are being left in

      • By altmanaltman · 2026-02-24 13:24 · 2 replies

        are you reading what you're writing?

        • By d1l · 2026-02-24 13:26

          It's a nonstop slop funnel as far as I can tell. Only ashamed I've been here for more than 5 minutes.

        • By swaminarayan · 2026-02-24 14:12

          yes

    • By nightfly · 2026-02-24 13:15

      It "only" doubles performance so the overheads aren't that heavy

    • By snowhale · 2026-02-24 14:02 · 1 reply

      [dead]

HackerNews