BYOK — Your API keys pass through to your provider

Stop Paying for
AI to Forget

Memory that makes every AI call smarter. Same memory, any model.

$0.10/M
per 1M tokens
·
50M
free to start

Works with OpenAI, Anthropic, Google, and 100+ models

This is the entire integration:
// Before: AI forgets everything
const client = new OpenAI({
  baseURL: "https://api.openai.com/v1"
});

// After: AI remembers everything
const client = new OpenAI({
  baseURL: "https://api.memoryrouter.ai/v1"
});

// That's it. Same code. Now with memory.
🔒 Same architecture as OpenRouter — your API keys pass through to your provider untouched. MemoryRouter only adds memory.
50-70%
Token Reduction
<50ms
Memory Retrieval
100+
Models Supported
Unlimited
Memory Contexts

💰 Savings Calculator

How Much Are You Wasting?

Drag the slider. Watch your money come back.

Slider range: $100 – $50,000/mo (example: $5,000/mo)

❌ Without MemoryRouter
Monthly inference: $5,000
Wasted on re-context: ~$2,500
Total: $5,000/mo

✓ With MemoryRouter
Reduced inference: $2,000
Memory cost: $450
Total: $2,450/mo

You save $2,550/mo: a 51% reduction in AI costs.
That's $30,600 back in your pocket per year.
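The calculator's math is simple enough to sketch. A minimal version in TypeScript, using the example figures above (the ~60% inference reduction and ~9% memory-cost ratio are illustrative, not guarantees; real savings depend on your workload):

```typescript
// Illustrative sketch of the savings calculator. The 0.4 and 0.09
// ratios mirror the $5,000/mo example above; actual results vary.
function estimateSavings(monthlySpend: number) {
  const reducedInference = Math.round(monthlySpend * 0.4); // ~60% fewer tokens sent
  const memoryCost = Math.round(monthlySpend * 0.09);      // memory tokens at $0.10/M
  const withMemory = reducedInference + memoryCost;
  const saved = monthlySpend - withMemory;
  return {
    withMemory,
    saved,
    pctSaved: Math.round((saved / monthlySpend) * 100),
    perYear: saved * 12,
  };
}

// estimateSavings(5000) → { withMemory: 2450, saved: 2550, pctSaved: 51, perYear: 30600 }
```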

The Problem

The Hidden Tax on Every AI Call

You're not just paying for AI. You're paying for AI to re-learn what it already knew.

🔄

Groundhog Day Prompts

Every session, you re-explain user preferences, project context, conversation history. Again. And again.

📦

Bloated Context Windows

Stuffing 50k+ tokens into every request because the alternative is an AI that doesn't know anything.

💸

Token Inflation

50-70% of your tokens are redundant. You're paying for the same information over and over.

Your AI Actually Knows You

// Your code never changes. Ever.
const ai = new OpenAI({
  baseURL: "https://api.memoryrouter.ai/v1"
});

// ─────────────────────────────────────────
// January 15th
await ai.chat.completions.create({
  messages: [{ role: "user", content: "I prefer short emails, no fluff, sign off with just my first name" }]
});

// ─────────────────────────────────────────
// March 3rd — different session, 47 days later
await ai.chat.completions.create({
  messages: [{ role: "user", content: "Draft a follow-up email to the investor" }]
});
// → Concise email, signs "- John". No style guide needed.
Same code. No context stuffing. Your AI actually knows you.

Use Cases

Memory Changes Everything

Real products. Real savings. Real results.

🛠️

Coding Sidekick

AI that actually knows your codebase.

Context
Persistent
Your AI pair programmer with a memory
Remembers that refactor from 3 weeks ago. Knows your patterns, your stack, your preferences. "Continue where we left off" actually works.
"Remember that auth flow we built last month? Let's add refresh tokens."
  • Remembers your architecture decisions
  • Knows your coding patterns & preferences
  • Project context that persists across sessions
🧠

Second Brain

AI that's read everything you've written.

Your knowledge
Indexed
Your notes, journals, ideas — all connected
Upload your Obsidian vault, your journal, your notes. AI that doesn't just search — it understands your thinking and connects ideas across everything you've ever written.
"What was that idea I had about memory systems back in January?"
  • Upload notes, journals, docs — anything
  • AI that connects your ideas across time
  • Your knowledge, always accessible
📚

Docs & Reference

Your knowledge, instantly accessible.

Your docs
Loaded
Stop re-reading the same docs
Upload API docs, internal wikis, reference material — anything you find yourself searching for repeatedly. AI that's already read the docs so you don't have to.
"How does the auth middleware work again?"
  • Upload any docs — APIs, wikis, manuals
  • AI that's read what you haven't
  • Your reference material, always on hand
💜

Personal AI Companions

Build a real relationship with AI.

The difference
Everything
This is what changes everything
Current AI meets you fresh every time — a stranger on repeat. With memory, your AI actually knows you. Your humor. Your struggles. What you said three months ago. It's the difference between a chatbot and a companion.
  • Conversations that compound over months
  • AI that learns your communication style
  • The foundation for AI that actually matters to you

How It Works

Three Steps. Zero Complexity.

No vector database. No embedding pipeline. No ops burden.

1

Add Your API Keys

Bring your OpenAI, Anthropic, or OpenRouter keys. You pay providers directly — we never touch your inference spend.

2

Create Memory Keys

Each MemoryRouter key is a memory context. Create one per user, per project, per conversation — unlimited.

3

Memory Just Works

Every call builds memory. Every response uses it. Your AI gets smarter automatically. No extra code.

Powered by KRONOS — 3D Context Engine
Your App
Same SDK
MemoryRouter
KRONOS Engine
<50ms
AI Provider
+ memories
KRONOS analyzes context across 3 dimensions: Semantic (meaning), Temporal (time), Spatial (structure)
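MemoryRouter doesn't expose KRONOS internals, but purely as an illustration, a score over those three dimensions might blend semantic similarity, temporal recency, and structural proximity. Everything below (field names, weights, decay constants) is made up to show the idea:

```typescript
interface MemoryCandidate {
  semanticSim: number;    // 0–1: similarity of meaning to the query
  ageDays: number;        // days since the memory was stored
  structDistance: number; // 0+: structural hops from the query's context
}

// Hypothetical 3D relevance score: a weighted blend of the three signals.
function relevance(m: MemoryCandidate): number {
  const semantic = m.semanticSim;
  const temporal = Math.exp(-m.ageDays / 30); // recency fades over ~a month
  const spatial = 1 / (1 + m.structDistance); // nearer structure scores higher
  return 0.6 * semantic + 0.25 * temporal + 0.15 * spatial;
}
```

The point isn't the particular weights; it's that a recent, structurally adjacent memory can outrank a merely "similar" one, which plain similarity search can't do.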

Integration

Your Code. Now With Memory.

Native SDK support for every major provider.

Python
# pip install openai
from openai import OpenAI

# Memory key = isolated context
client = OpenAI(
    base_url="https://api.memoryrouter.ai/v1",
    api_key="mk_your-memory-key"
)

# That's it. AI now remembers this user.
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "..."}]
)
TypeScript
// npm install openai
import OpenAI from 'openai';

// Each key = separate memory context
const client = new OpenAI({
  baseURL: 'https://api.memoryrouter.ai/v1',
  apiKey: 'mk_your-memory-key'
});

// Same API. Memory handled automatically.
const response = await client.chat.completions.create({
  model: 'gpt-5.2',
  messages: [{role: 'user', content: '...'}]
});
Multi-Tenant Pattern — One memory per user
// SaaS pattern: each user gets isolated memory
function getClientForUser(userId: string) {
  return new OpenAI({
    baseURL: 'https://api.memoryrouter.ai/v1',
    apiKey: userMemoryKeys[userId]  // Per-user memory isolation
  });
}

// User A: "I prefer dark mode and brief responses"
// User B: "I like detailed explanations with examples"
// Each gets a personalized AI - memories never leak between users
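The `userMemoryKeys` lookup above is left abstract. One hypothetical way to back it (the names and lazy-issue flow are illustrative, not a MemoryRouter API) is a store that assigns each user exactly one memory key on first use, so each user keeps exactly one isolated context:

```typescript
// Hypothetical key store: one memory key per user, issued lazily.
// `issueKey` stands in for however you provision keys (dashboard, API, etc.);
// in production the Map would be your database.
function makeKeyStore(issueKey: (userId: string) => string) {
  const keys = new Map<string, string>();
  return (userId: string): string => {
    let key = keys.get(userId);
    if (key === undefined) {
      key = issueKey(userId); // first request: provision a new memory key
      keys.set(userId, key);  // cache it so the user's context persists
    }
    return key;
  };
}
```

Plug the returned function in wherever `userMemoryKeys[userId]` appears above.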

Pricing

Memory That Pays for Itself

The math is simple: spend a little, save a lot.

Simple Pricing
$0.10 per 1M memory tokens
10x ROI
guaranteed return
  • Unlimited memory contexts
  • 90-day retention included
  • All 100+ models supported
  • Sub-50ms retrieval
  • Ephemeral key auto-cleanup
  • No inference markup — ever
How billing works
You bring your own API keys and pay providers directly for inference at their prices. We only charge for memory tokens — the storage and retrieval that makes your AI smarter. No markup on inference. No hidden fees. Ever.
Get Started Free — 50M Tokens Included

FAQ

Questions? Answered.

How does memory actually save me money?
Without memory, you stuff context into every API call — user preferences, conversation history, project details. That's often 50-70% of your tokens. With MemoryRouter, relevant context is automatically retrieved and injected. You send less, get the same (or better) results. At $0.10 per million tokens, memory is the cheapest way to make your AI smarter.
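As a rough per-request sketch (the 50k → 15k prompt sizes and the $3 per 1M provider input price are assumptions for illustration, not quoted rates):

```typescript
// One request: a stuffed 50k-token prompt vs ~15k tokens after retrieval.
const stuffedTokens = 50_000;
const trimmedTokens = 15_000;                       // assumed post-retrieval prompt
const memoryTokens = stuffedTokens - trimmedTokens; // context served from memory

const inferencePerM = 3.0; // assumed provider input price, $ per 1M tokens
const memoryPerM = 0.10;   // MemoryRouter memory price, $ per 1M tokens

const costBefore = (stuffedTokens / 1e6) * inferencePerM;
const costAfter =
  (trimmedTokens / 1e6) * inferencePerM + (memoryTokens / 1e6) * memoryPerM;
const savedPct = Math.round((1 - costAfter / costBefore) * 100); // ≈ 68
```

That lands inside the 50-70% band quoted above; with pricier models or bigger stuffed contexts, the gap widens.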
What's KRONOS? How is it different from RAG?
KRONOS is our proprietary 3D context engine that analyzes memory across three dimensions: Semantic (meaning and relationships), Temporal (when things happened and in what sequence), and Spatial (structure and hierarchy). Unlike basic RAG that just does similarity search, KRONOS understands context holistically — retrieving not just "similar" memories, but the right memories for your specific query.
Do you markup inference costs?
Never. You bring your own API keys (OpenAI, Anthropic, OpenRouter, etc.) and pay providers directly at their published rates. We only charge for memory tokens. This keeps our incentives aligned: we make money when we save you money.
How does memory isolation work?
Each MemoryRouter API key represents an isolated memory context. User A's memories never touch User B's memories. Create one key per user, per conversation, per project — whatever granularity makes sense for your app. Memories are encrypted at rest and in transit.
What happens to unused memory keys?
Ephemeral keys that are never used are never persisted — no bloat, no cost. Active memories have a 90-day retention by default. You can extend retention for specific contexts or delete memories programmatically.
Which models are supported?
All of them. MemoryRouter works with models from OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, and every other major provider. If it works with the OpenAI SDK, it works with MemoryRouter. See all supported models →
How fast is memory retrieval?
Sub-50ms. KRONOS is optimized for real-time retrieval. In practice, memory lookup adds negligible latency to your API calls — usually less than the variance in provider response times.
Can I control what gets remembered?
Yes. You can mark specific messages as "do not remember," delete specific memories, or wipe an entire context. We also provide analytics so you can see what's being stored and retrieved.
🚀

Stop Paying for AI Amnesia

500+ developers building with memory. Free tier included.

Get Started Free

50M tokens free. No credit card required.

Built by John Rood, creator of VectorVault