One signed binary. Every feature compiled in. Free to run. Install Crowkis →

Features

Everything Crowkis does, on one page.

A semantic cache that understands meaning, long-term memory for your agents, guardrails that say no, and the Rust engine that makes it all sub-millisecond — every capability, grouped and linked.

The ones marked ▲ Trending are what the AI market is buying right now — agent memory, MCP, RAG, guardrails, evals, and gateways. Crowkis ships them all in one self-hosted, zero-egress binary.

Explore Agent Memory Install Crowkis

28+ features, one binary4 protocols — RESP · gRPC · REST · MCP0 external API calls$0 to run Community

one engine · every feature

The core

A cache that understands meaning

Seven systems decide what is safe to reuse — semantic and structural together, gated by confidence, freshness, and trust.

1.01▲ Trending

Semantic + structural matching

Vector similarity and intent/template matching together — paraphrases hit, but a wrong number or entity never does.

Read the deep-dive→

Confidence scoring

Every hit returns a 5-signal geometric-mean score so you gate reuse on a number, not faith.

Read the deep-dive→

Adaptive thresholds

Per-intent base bars + complexity adjustment + an EMA feedback loop that learns and persists.

Read the deep-dive→

Anti-poisoning pipeline

Five stages score every write before it can be served — coherence, content, trust, isolation, neighbourhood.

Read the deep-dive→

Smart eviction

Composite retention by recency, frequency, isolation, and compute cost — keeps the answers that are expensive to rebuild.

Read the deep-dive→

Freshness control

Per-intent TTL policies, version pinning, and webhook invalidation, with freshness decay inside confidence.

Read the deep-dive→

For agents

Memory and reuse for agentic systems

The features that make agents remember, share work, and stop paying twice.

2.01▲ Trending

Agent memory

Long-term, consolidating, bi-temporal memory scoped to (agent, user) — 70.4% recall@10 on LoCoMo.

Read the deep-dive→

2.02▲ Trending

Reasoning reuse

Cache the chain-of-thought as a step graph and replay it for the next query at ~15% of the token cost.

Read the deep-dive→

Sessions

Multi-turn conversation buffers with both recent-window reads and semantic search across the whole chat.

Read the deep-dive→

Tool-result cache

Cache a deterministic tool call keyed by tool + exact args, so a swarm's duplicate lookups become one.

Read the deep-dive→

2.05▲ Trending

MCP for AI apps

Let Claude Code and agents use the cache as a tool over MCP — one config block, same trust pipeline.

Read the deep-dive→

2.06▲ Trending

Multimodal cache

Cache image-plus-text lookups, so a repeated vision question is a hit instead of an expensive re-run.

Read the deep-dive→

Safety & guardrails

The features that say no

Input and output gates, evals, and human-approved answers — all local, all zero-egress.

3.01▲ Trending

Input guardrails (CGUARD)

Prompt-injection and jailbreak scanning that normalizes leetspeak and zero-width evasion first.

Read the deep-dive→

Output guardrails (COUTCHECK)

PII-leak, toxicity, and JSON-validity scanning on the response before it ships.

Read the deep-dive→

3.03▲ Trending

Online evals (CEVAL)

Nine deterministic evaluators that grade output without a second model — tracked over time on /metrics.

Read the deep-dive→

Pinned answers

Serve a human-approved answer verbatim for the questions where 'close enough' is unacceptable.

Read the deep-dive→

Negative cache

Flag a wrong answer once; every paraphrase of the question that would reproduce it is caught.

Read the deep-dive→

PII scrub & erasure

Report what personal data is cached and execute right-to-erasure on request.

Read the deep-dive→

Build & operate

Everything around the cache

Gateway, RAG, prompt ops, budgets, and the observability to run it all.

4.01▲ Trending

AI Gateway

An OpenAI-compatible proxy — point your client at Crowkis and get semantic caching, retries, and routing.

Read the deep-dive→

4.02▲ Trending

Self-hosted RAG (CDOC)

Auto-chunking, metadata filtering, and reranking inside the cache — no separate vector database.

Read the deep-dive→

Prompt versioning & A/B

Named templates with versioning, variable rendering, sticky per-user splits, and rollback.

Read the deep-dive→

Budgets & rate limits

Per-tenant spend visibility and requests/tokens-per-minute ceilings, enforced before the invoice.

Read the deep-dive→

Local embeddings (CEMBED)

Free, cached, no-API-key embeddings from the bundled ONNX model — the foundation everything else stands on.

Read the deep-dive→

Observability

Live dashboard, CINFO, and Prometheus /metrics — hit rate, saved spend, safety blocks, all in the box.

Read the deep-dive→

The platform

What it's all built on

Drop-in adoption, a hand-built Rust engine, and one signed image.

Redis-compatible (RESP3)

Existing Redis clients connect unmodified — adoption is a port change, not a rewrite.

Read the deep-dive→

Built in Rust

A custom LSM engine and in-process vector index, no GC in the read path — sub-millisecond hits by design.

Read the deep-dive→

One signed image

Every feature compiled in; a license file flips Community to Enterprise at boot. No supply chain to attack.

Read the deep-dive→

Four protocols

RESP, gRPC, REST, and MCP front the same engine — reach the cache however your stack prefers.

Read the deep-dive→

hover · click to ripple

One surface, many features

Every feature is a cell in the same grid

Cache, memory, guardrails, gateway — they aren't bolt-ons stitched across services. They're facets of one Redis-compatible engine, sharing the same store, the same embedder, the same trust pipeline. Touch one and the whole thing responds.

One signed image. Every feature. Free to run.

Community edition ships at full power. A license file flips it to Enterprise at boot.

Get started Enterprise