v0.1.5 — Official OpenClaw Plugin

Total recall.
90% fewer tokens.

Your agent forgets past decisions and burns tokens re-reading the same context. Memory Stack runs 5 search engines locally, returns only what matters, and never loses a fact. Bring your own API key to unlock LLM-powered fact extraction for the complete experience.

openclaw-memory
# Install — paste this, everything else is automatic
$ curl -fsSL .../api/install.sh | bash -s -- --key=oc-starter-xxx
[OK] License verified
[OK] Registered as OpenClaw memory plugin
[OK] OpenClaw gateway restarting
 
# Done. Just talk — memory works in the background
You: What did we decide about the API design?
[memory] 5 engines searched → fused → T1 tier (~98 tokens)
Agent: We decided to use REST with /api/v2/...
 
# Updates are automatic — no action needed
[gateway] Memory Stack auto-updated to v0.1.5
// why memory stack

Fewer tokens. More memory. Faster answers.

Every wasted token is money burned. Memory Stack eliminates the waste.

💡

3-Tier Token Control

Native memory dumps full text every time. Memory Stack gives you three tiers: L0 auto-recall at ~100 tokens, L1 summaries at ~800, L2 full content on demand. 90% fewer tokens per search — your agent gets exactly what it needs, nothing more.
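The tier idea can be sketched in a few lines. This is an illustrative model only, not Memory Stack's actual API; the `Memory` fields and `recall` function are invented for the sketch.

```python
# Hypothetical sketch of 3-tier output selection (names are illustrative).
from dataclasses import dataclass

@dataclass
class Memory:
    snippet: str    # L0: ~100-token auto-recall line
    summary: str    # L1: ~800-token summary
    full_text: str  # L2: complete content

def recall(memory: Memory, tier: str = "L0") -> str:
    """Return only as much context as the requested tier allows."""
    return {"L0": memory.snippet,
            "L1": memory.summary,
            "L2": memory.full_text}[tier]

m = Memory(snippet="API: REST, /api/v2 prefix",
           summary="We chose REST over GraphQL because ...",
           full_text="(full meeting transcript ...)")

print(recall(m))        # cheap default: L0 snippet
print(recall(m, "L2"))  # full text only on explicit demand
```

The default tier is the cheapest one; escalation to L1 or L2 is an explicit choice, which is what keeps the per-search token cost bounded.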

🔍

5-Engine Search Fusion

One search fires 5 engines in parallel — full-text, semantic, markdown, fact store, and compressed history. Results merge with rank fusion and diversity reranking. Right answer on the first try. No wasted tokens chasing wrong context.
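Rank fusion of this kind is commonly done with reciprocal rank fusion (RRF). The sketch below shows the general technique with made-up engine names and weights; Memory Stack's actual per-engine weights are internal.

```python
# Illustrative reciprocal rank fusion (RRF) over per-engine result lists.
from collections import defaultdict

def rrf(ranked_lists, weights=None, k=60):
    """Merge ranked lists: score(doc) = sum over engines of w / (k + rank)."""
    weights = weights or {name: 1.0 for name in ranked_lists}
    scores = defaultdict(float)
    for engine, results in ranked_lists.items():
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += weights[engine] / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

results = {
    "fulltext": ["doc-api", "doc-auth"],
    "semantic": ["doc-auth", "doc-api", "doc-infra"],
    "facts":    ["doc-api"],
}
print(rrf(results))  # documents found by more engines rank first
```

Documents surfaced by several engines accumulate score from each list, so agreement across engines beats a high rank in any single one.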

🛡

Never Forget a Decision

When conversations get long, compression eats old messages. Memory Stack extracts key facts — decisions, deadlines, requirements — into a dedicated store before they disappear. Your agent recalls them instantly instead of you re-explaining. Zero wasted tokens.
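The "save facts before compaction" flow can be sketched as below. The message schema and `compact` helper are invented for illustration (the real plugin uses LLM-powered extraction, not a `kind` tag), but the order of operations is the point: facts move to the store before old messages are dropped.

```python
# Sketch of fact preservation before compaction (schema is illustrative).
fact_store = []

def compact(history, keep_last=2):
    """Before dropping old messages, copy tagged facts into the store."""
    for msg in history[:-keep_last]:
        if msg.get("kind") in {"decision", "deadline", "requirement"}:
            fact_store.append(msg["text"])
    return history[-keep_last:]  # the compressed window

history = [
    {"kind": "chat", "text": "hi"},
    {"kind": "decision", "text": "Use REST with an /api/v2 prefix"},
    {"kind": "chat", "text": "ok"},
    {"kind": "chat", "text": "next topic"},
]
history = compact(history)
print(fact_store)  # the decision survives compaction
```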

🔗

Entity Tracking

Flat text search forces your agent to re-read everything to find connections. Memory Stack automatically tracks entities and their relationships — who changed what, what depends on what, how things evolved. Queryable on demand, not buried in old conversations.

🧹

Self-Cleaning Memory

Duplicates and junk cost real money every time your agent reads them. 4-level deduplication (exact, normalized, substring, cosine) runs automatically. Health score 0-100 shows exactly what's wasting tokens. Your memory stays lean, your bill stays flat.
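The four levels named above (exact, normalized, substring, cosine) can be sketched as a single check. The cosine step here uses a toy bag-of-words similarity for self-containment; the plugin presumably uses embeddings.

```python
# Rough sketch of 4-level deduplication: exact, normalized, substring, cosine.
import math
import re
from collections import Counter

def normalize(text):
    return re.sub(r"\s+", " ", text.lower()).strip()

def cosine(a, b):
    """Toy bag-of-words cosine similarity between two strings."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def is_duplicate(new, existing, threshold=0.9):
    n = normalize(new)
    for old in existing:
        o = normalize(old)
        if new == old:                 # level 1: exact match
            return True
        if n == o:                     # level 2: normalized match
            return True
        if n in o or o in n:           # level 3: substring containment
            return True
        if cosine(n, o) >= threshold:  # level 4: cosine similarity
            return True
    return False

store = ["Use REST with an /api/v2 prefix"]
print(is_duplicate("use REST  with an /api/v2 prefix", store))  # True
print(is_duplicate("Switch auth to OAuth2", store))             # False
```

Cheap checks run first, so the expensive similarity comparison only fires on near-misses.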

Bring Your Key. Unlock Full Power.

Core search runs offline out of the box. Add any LLM API key — OpenAI, Anthropic, Ollama, MLX, or any OpenAI-compatible endpoint — and Memory Stack auto-detects it. LLM-powered fact extraction kicks in: every conversation produces structured decisions, deadlines, and entities stored permanently. A few cents per session saves dollars of wasted tokens. Your key, your choice of provider.
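Key auto-detection of this kind typically checks well-known environment variables. The sketch below is an assumption about how it might work, not Memory Stack's actual detection logic; `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` are those providers' standard variable names, while the generic entry is invented for the sketch.

```python
# Illustrative provider auto-detection via environment variables.
import os

PROVIDERS = [
    ("openai",    "OPENAI_API_KEY"),
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("generic",   "LLM_API_KEY"),  # any OpenAI-compatible endpoint (assumed name)
]

def detect_provider(env=None):
    env = os.environ if env is None else env
    for name, var in PROVIDERS:
        if env.get(var):
            return name
    return None  # no key: fact extraction off, core search still works offline

print(detect_provider({"ANTHROPIC_API_KEY": "sk-..."}))  # anthropic
```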

// how it works

Paste one command. Everything else is automatic.

One curl command installs, registers, and restarts OpenClaw. Updates happen automatically in the background.

You talk to OpenClaw → auto-recall kicks in
Relevant memories found → injected before response
Agent responds → key facts extracted and saved
Conversation gets long → compaction happens
You ask about old decisions → fact store has them
// vs native memory

Same agent. Fewer tokens. Better recall.

Same conversations. One remembers everything at 10% of the token cost.

5
search engines
vs native's 2
3
output tiers
L0 ~100 / L1 ~800 / L2 full
90%
fewer tokens per search
lower API bill
Question | Native Memory | Memory Stack

Remembers longer
What happens when the conversation gets too long? | Old messages get compressed. Decisions disappear. | Key facts are saved before compression and brought back automatically.
Can it remember things from last week? | Only if it's still in the conversation window. | Yes. Recent memories rank higher, but older ones are still searchable.
Does it understand how things connect? | No. It searches text, not relationships. | Yes. Entity tracking links people, tools, and decisions, queryable on demand.
Can it trace how a decision evolved? | No. | Yes. Evolution chains link past decisions to current ones.

Saves you money
How much does each memory search cost? | Loads full text every time. More tokens, higher API bill. | Loads a summary first. Only fetches full text when needed. Uses up to 90% fewer tokens.
Does it waste money on duplicate results? | Can feed the same info to your AI twice. You pay for both. | Removes duplicates before sending anything to the AI. You only pay once.
Does the cost grow over time? | Memory piles up. More junk = more tokens = higher cost. | Auto-cleanup merges similar memories. Stays lean, cost stays flat.

Finds things faster
How many search methods run per query? | 2 (keyword + vector) | 5 engines, merged with rank fusion and per-engine weights.
Does it understand what you meant, not just what you typed? | Basic keyword matching. | Query expansion rewrites your question locally before searching. No API call needed.
Can it search across past conversations? | Limited. | Dedicated fact store and entity tracking find facts across all conversations instantly.
Can you check if your memory is healthy? | No. | Quality score 0-100 shows duplicates, stale entries, noise.
How much context does each recall use? | Full text every time. No control over token usage. | Tiered output: L0 auto-recall uses ~100 tokens, L1 summaries ~800, full text only on demand.
Does it need API keys or cloud services? | Vector search needs an embedding provider. | Core search runs offline. Bring your own LLM key (OpenAI, Anthropic, Ollama, MLX, or any provider) to unlock structured fact extraction from every conversation.
// vs other memory skills

One install replaces 4 skills.

Most memory skills do one thing. You end up installing 3-4 and hoping they work together.

What you need | Other skills | Memory Stack
Find a function name | Vector search misses exact names | Full-text keyword search finds it instantly
Find "how does auth work" | Vector search works | Semantic search with query expansion
Search across 5 conversations | Limited to current context | Fact store + entity tracking
Control token spend | Full text every time | 3 tiers: ~100 / ~800 / full
Remove duplicates | Manual cleanup | 4-level auto-dedup
Track decision evolution | No history | Evolution tracking across conversations
Check memory quality | No tooling | Health score 0-100
Work offline | Needs OpenAI key | Core search runs offline
$49
One-time. No subscription. Pays for itself in the first week of saved API costs.
// frequently asked

Questions OpenClaw users ask.

What is OpenClaw Memory Stack?

A drop-in OpenClaw plugin that replaces built-in memory. 5 search engines with rank fusion, entity tracking, and 3-tier token control. Your agent recalls more while using up to 90% fewer tokens. Core search and memory run locally — add your own LLM key for enhanced fact extraction.

How does it improve OpenClaw's memory?

5 engines fire in parallel with automatic fallback. Results merge with rank fusion and diversity reranking. Entities and relationships are tracked automatically and queryable on demand. Tiered output controls exactly how many tokens each recall costs. Add your own LLM key (OpenAI, Anthropic, Ollama, MLX, or any compatible provider) for structured fact extraction — the complete Memory Stack experience.

Does it work with OpenClaw's Telegram integration?

Yes. Memory Stack plugs into OpenClaw as a native memory provider. It works with Telegram, CLI, and any other OpenClaw channel. No extra configuration needed — one command and it's live.

Does it need an internet connection?

Core search, rank fusion, deduplication, and entity tracking all run locally. No data leaves your machine. For enhanced fact extraction, add your own LLM key — supports OpenAI, Anthropic, Ollama, MLX, and any compatible endpoint. Auto-detected at startup. Without a key, core search still works fully offline. Update checks run in the background and fail silently.

How does it save money on API costs?

Every token your AI reads costs money. Memory Stack cuts that three ways: (1) Tiered output — auto-recall uses ~100 tokens, on-demand search uses ~800 tokens, full text only loads when requested. Up to 90% fewer tokens per search. (2) Duplicate removal so you don't pay for the same information twice. (3) Compressed history — your agent drills down only when it needs detail.
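The "up to 90%" figure follows from the tier sizes. The 1,000-token full-text baseline below is an assumption for illustration; actual savings depend on how large your memories are.

```python
# Back-of-envelope token math behind the "up to 90% fewer tokens" claim.
full_text_recall = 1_000  # tokens a native full-text recall might load (assumed)
l0_recall = 100           # Memory Stack L0 auto-recall (~100 tokens)

savings = 1 - l0_recall / full_text_recall
print(f"{savings:.0%}")  # 90%
```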

What happens when conversations get compressed?

When OpenClaw conversations get long, the system compresses old messages to fit the context window. Important decisions can get lost in this process. Memory Stack extracts key facts (decisions, deadlines, architecture choices) into a dedicated store before compression happens, and retrieves them instantly when relevant — zero wasted tokens re-explaining things.

Is it a subscription?

No. One-time purchase of $49. You own the code. No recurring fees, no SaaS, no data collection. Just files that live on your machine.

Do I get updates?

Yes. Memory Stack checks for new versions automatically when it starts up and applies updates in the background, no action needed. Bug fixes within your version are always free. No manual checking, no update subscriptions.