Everything Crowkis does, on one page.
A semantic cache that understands meaning, long-term memory for your agents, guardrails that say no, and the Rust engine that makes it all sub-millisecond — every capability, grouped and linked.
The ones marked ▲ Trending are what the AI market is buying right now — agent memory, MCP, RAG, guardrails, evals, and gateways. Crowkis ships them all in one self-hosted, zero-egress binary.
one engine · every feature
A cache that understands meaning
Seven systems decide what is safe to reuse — semantic and structural together, gated by confidence, freshness, and trust.
Semantic + structural matching
Vector similarity and intent/template matching together — paraphrases hit, but a wrong number or entity never does.
Confidence scoring
Every hit returns a 5-signal geometric-mean score so you gate reuse on a number, not faith.
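To show why a geometric mean is a reasonable basis for gating, here is a minimal sketch that folds five signals into one score. The signal names and the 0.8 bar are assumptions made for illustration; this page does not list the actual signals or defaults Crowkis uses.

```python
import math

def confidence(signals: dict[str, float]) -> float:
    """Geometric mean of signals, each expected in [0, 1]."""
    values = list(signals.values())
    if not values:
        raise ValueError("need at least one signal")
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("signals must lie in [0, 1]")
    return math.prod(values) ** (1.0 / len(values))

# Hypothetical signal names: four strong signals and one weak one.
hit = {
    "similarity": 0.95,
    "structure": 0.90,
    "freshness": 0.90,
    "trust": 0.90,
    "agreement": 0.10,
}
score = confidence(hit)
reuse = score >= 0.8  # assumed bar, not a Crowkis default
```

Unlike an arithmetic mean, the geometric mean lets one weak signal pull the whole score down, so a hit that looks similar but fails one check is not reused.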
Adaptive thresholds
Per-intent base bars + complexity adjustment + an EMA feedback loop that learns and persists.
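The three ingredients named above can be sketched roughly as follows. The class name, the coefficients, and the choice of feedback signal (whether a served hit was later judged wrong) are all assumptions for illustration, and persistence is left out.

```python
from dataclasses import dataclass

@dataclass
class AdaptiveThreshold:
    base: float              # per-intent base bar (hypothetical value)
    alpha: float = 0.1       # EMA smoothing factor (assumed)
    error_ema: float = 0.0   # smoothed rate of hits later judged wrong

    def record_feedback(self, hit_was_wrong: bool) -> None:
        """Fold one feedback event into the exponential moving average."""
        x = 1.0 if hit_was_wrong else 0.0
        self.error_ema = self.alpha * x + (1.0 - self.alpha) * self.error_ema

    def threshold(self, complexity: float) -> float:
        """complexity in [0, 1]; harder queries and recent errors raise the bar."""
        raw = self.base + 0.05 * complexity + 0.2 * self.error_ema
        return min(raw, 0.99)

bar = AdaptiveThreshold(base=0.80)
bar.record_feedback(hit_was_wrong=True)
current = bar.threshold(complexity=0.5)
```

In this sketch a run of bad hits tightens the bar for that intent, and the EMA lets it relax again as clean feedback accumulates.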
Anti-poisoning pipeline
Five stages score every write before it can be served — coherence, content, trust, isolation, neighbourhood.
Smart eviction
Composite retention by recency, frequency, isolation, and compute cost — keeps the answers that are expensive to rebuild.
Freshness control
Per-intent TTL policies, version pinning, and webhook invalidation, with freshness decay inside confidence.
Memory and reuse for agentic systems
The features that make agents remember, share work, and stop paying twice.
Agent memory
Long-term, consolidating, bi-temporal memory scoped to (agent, user) — 70.4% recall@10 on LoCoMo.
Reasoning reuse
Cache the chain-of-thought as a step graph and replay it for the next query at ~15% of the token cost.
Sessions
Multi-turn conversation buffers with both recent-window reads and semantic search across the whole chat.
Tool-result cache
Cache a deterministic tool call keyed by tool + exact args, so a swarm's duplicate lookups become one.
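One way to key a cache by tool plus exact arguments is to hash a canonical serialization of the call. The sketch below is an in-process illustration only; the key layout and helper names are hypothetical and say nothing about how Crowkis stores entries.

```python
import hashlib
import json

def tool_cache_key(tool: str, args: dict) -> str:
    """Same tool and same args always give the same key, whatever the dict order."""
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(f"{tool}\n{canonical}".encode("utf-8")).hexdigest()
    return f"tool:{tool}:{digest}"

_cache: dict[str, object] = {}

def cached_call(tool: str, args: dict, fn):
    """Run fn(**args) once per distinct (tool, args) pair and reuse the result."""
    key = tool_cache_key(tool, args)
    if key not in _cache:
        _cache[key] = fn(**args)
    return _cache[key]
```

This only makes sense for deterministic tools: a call whose result depends on time or external state needs an expiry policy on top.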
MCP for AI apps
Let Claude Code and agents use the cache as a tool over MCP — one config block, same trust pipeline.
Multimodal cache
Cache image-plus-text lookups, so a repeated vision question is a hit instead of an expensive re-run.
The features that say no
Input and output gates, evals, and human-approved answers — all local, all zero-egress.
Input guardrails (CGUARD)
Prompt-injection and jailbreak scanning that normalizes leetspeak and zero-width evasion first.
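Normalizing before scanning could look roughly like the sketch below. The character maps and the single blocked phrase are illustrative assumptions, not CGUARD's actual rules.

```python
import re
import unicodedata

# Code points to delete: common zero-width and invisible joiner characters.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))

# A small, assumed leetspeak map; a real scanner would cover far more.
LEET = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})

def normalize(text: str) -> str:
    """Return a scan-only copy of text with common evasions undone."""
    text = unicodedata.normalize("NFKC", text)  # fold compatibility forms
    text = text.translate(ZERO_WIDTH)           # drop invisible characters
    text = text.translate(LEET).lower()         # undo digit/symbol swaps
    return re.sub(r"\s+", " ", text).strip()

def looks_like_injection(prompt: str) -> bool:
    # Hypothetical single-phrase check standing in for a real rule set.
    return "ignore previous instructions" in normalize(prompt)
```

The normalized string is used only for matching. The original prompt is what gets forwarded, since the leetspeak map would mangle legitimate numbers.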
Output guardrails (COUTCHECK)
PII-leak, toxicity, and JSON-validity scanning on the response before it ships.
Online evals (CEVAL)
Nine deterministic evaluators that grade output without a second model — tracked over time on /metrics.
Pinned answers
Serve a human-approved answer verbatim for the questions where 'close enough' is unacceptable.
Negative cache
Flag a wrong answer once; every paraphrase of the question that would reproduce it is caught.
PII scrub & erasure
Report what personal data is cached and execute right-to-erasure on request.
Everything around the cache
Gateway, RAG, prompt ops, budgets, and the observability to run it all.
AI Gateway
An OpenAI-compatible proxy — point your client at Crowkis and get semantic caching, retries, and routing.
Self-hosted RAG (CDOC)
Auto-chunking, metadata filtering, and reranking inside the cache — no separate vector database.
Prompt versioning & A/B
Named templates with versioning, variable rendering, sticky per-user splits, and rollback.
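A sticky per-user split is commonly built by hashing the user into a fixed point on [0, 1) and walking the variant weights. The sketch below assumes that approach; the function name and weight format are hypothetical.

```python
import hashlib

def assign_variant(template: str, user_id: str, weights: dict[str, float]) -> str:
    """Deterministically map a user to one variant of a named template."""
    if not weights or sum(weights.values()) <= 0:
        raise ValueError("weights must contain a positive total")
    digest = hashlib.sha256(f"{template}:{user_id}".encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    total = sum(weights.values())
    cumulative = 0.0
    chosen = next(iter(weights))
    for variant, weight in weights.items():
        chosen = variant
        cumulative += weight / total
        if point < cumulative:
            break
    return chosen  # falls through to the last variant on float rounding

variant = assign_variant("support-greeting", "user-42", {"v1": 0.5, "v2": 0.5})
```

Because the assignment depends only on the template name and user id, the same user sees the same version on every request with no assignment table to store. In this sketch, rolling back is just setting the retired variant's weight to zero.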
Budgets & rate limits
Per-tenant spend visibility and requests/tokens-per-minute ceilings, enforced before the invoice.
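A requests-per-minute ceiling per tenant can be sketched with a sliding window of timestamps. This is a generic illustration under assumed names, with the clock passed in explicitly so the behaviour is easy to follow; it is not Crowkis's enforcement code.

```python
from collections import deque

class RequestsPerMinute:
    """Sliding-window limiter: at most `limit` requests per window, per tenant."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self._stamps: dict[str, deque] = {}

    def allow(self, tenant: str, now: float) -> bool:
        stamps = self._stamps.setdefault(tenant, deque())
        # Forget requests that have aged out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.limit:
            return False  # over the ceiling: reject before any upstream spend
        stamps.append(now)
        return True

limiter = RequestsPerMinute(limit=2)
first = limiter.allow("tenant-a", now=0.0)
second = limiter.allow("tenant-a", now=1.0)
third = limiter.allow("tenant-a", now=2.0)
```

A tokens-per-minute ceiling follows the same shape, with each entry carrying a token count and the check comparing the running sum against the limit.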
Local embeddings (CEMBED)
Free, cached, no-API-key embeddings from the bundled ONNX model — the foundation everything else stands on.
Observability
Live dashboard, CINFO, and Prometheus /metrics — hit rate, saved spend, safety blocks, all in the box.
What it's all built on
Drop-in adoption, a hand-built Rust engine, and one signed image.
Redis-compatible (RESP3)
Existing Redis clients connect unmodified — adoption is a port change, not a rewrite.
Built in Rust
A custom LSM engine and in-process vector index, no GC in the read path — sub-millisecond hits by design.
One signed image
Every feature compiled in; a license file flips Community to Enterprise at boot. No supply chain to attack.
Four protocols
RESP, gRPC, REST, and MCP front the same engine — reach the cache however your stack prefers.
Every feature is a cell in the same grid
Cache, memory, guardrails, gateway — they aren't bolt-ons stitched across services. They're facets of one Redis-compatible engine, sharing the same store, the same embedder, the same trust pipeline. Touch one and the whole thing responds.
One signed image. Every feature. Free to run.
Community edition ships at full power. A license file flips it to Enterprise at boot.