
ConvoCache

Author
Manoj
ML Engineer @ 7-Eleven
Pitch

ConvoCache gives AI assistants a persistent memory layer based on what the model actually attended to.

Instead of summarizing everything or relying only on vector retrieval, it tracks which past turns influenced later responses and retains the key-value (KV) cache state for those turns longer.
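One way to make "which past turns influenced later responses" concrete is to sum the attention a response places on each earlier turn's tokens. The sketch below is illustrative only: the function name `attention_mass_per_turn`, the input layout, and the assumption that weights are already averaged over heads and layers are hypothetical, not part of ConvoCache.

```python
from typing import Dict, List, Tuple


def attention_mass_per_turn(
    attn: List[List[float]],
    turn_spans: Dict[str, Tuple[int, int]],
) -> Dict[str, float]:
    """Average attention mass that one response places on each past turn.

    attn[i][j] is the weight generated token i puts on context token j.
    Assumption: weights are already averaged over heads and layers.
    turn_spans maps a turn id to its half-open token range (start, end).
    """
    if not attn:
        return {turn_id: 0.0 for turn_id in turn_spans}

    n_generated = len(attn)
    mass: Dict[str, float] = {}
    for turn_id, (start, end) in turn_spans.items():
        # Sum the weights falling inside this turn, over all generated tokens.
        total = sum(sum(row[start:end]) for row in attn)
        mass[turn_id] = total / n_generated
    return mass


# Toy example: two generated tokens attending over four context tokens.
attn = [
    [0.1, 0.1, 0.6, 0.2],
    [0.0, 0.2, 0.7, 0.1],
]
spans = {"turn-1": (0, 2), "turn-2": (2, 4)}
mass = attention_mass_per_turn(attn, spans)
```

In this toy input the response leans mostly on `turn-2`, so that turn would be the stronger candidate for longer retention.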

Memory Policy

flowchart TD
  Turns[Conversation turns] --> Attention[Observed attention weights]
  Attention --> Score[Retention score]
  Score --> Hot[Hot memory]
  Score --> Warm[Warm memory]
  Score --> Evict[Evict or summarize]
  Hot --> Rehydrate[Future session rehydration]
  Warm --> Rehydrate
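The policy in the flowchart could be sketched as a running retention score per turn plus two thresholds. Everything named here is an assumption made for illustration: the `RetentionTracker` class, the exponential-moving-average update, and the values of `DECAY`, `HOT_THRESHOLD`, and `WARM_THRESHOLD` do not come from ConvoCache itself.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical tuning values; a real system would calibrate these.
HOT_THRESHOLD = 0.30
WARM_THRESHOLD = 0.05
DECAY = 0.8


@dataclass
class RetentionTracker:
    """Keeps an exponentially decayed retention score per conversation turn."""

    scores: Dict[str, float] = field(default_factory=dict)

    def observe(self, mass_per_turn: Dict[str, float]) -> None:
        """Fold one response's per-turn attention mass into the running scores."""
        for turn_id in set(self.scores) | set(mass_per_turn):
            previous = self.scores.get(turn_id, 0.0)
            observed = mass_per_turn.get(turn_id, 0.0)
            # Turns that stop receiving attention decay toward zero.
            self.scores[turn_id] = DECAY * previous + (1 - DECAY) * observed

    def tier(self, turn_id: str) -> str:
        """Map a retention score to hot, warm, or evict."""
        score = self.scores.get(turn_id, 0.0)
        if score >= HOT_THRESHOLD:
            return "hot"
        if score >= WARM_THRESHOLD:
            return "warm"
        return "evict"


tracker = RetentionTracker()
for _ in range(5):
    # Same attention pattern observed across five responses.
    tracker.observe({"a": 0.70, "b": 0.25, "c": 0.05})

tiers = {turn_id: tracker.tier(turn_id) for turn_id in ("a", "b", "c")}
```

A decayed average is only one possible choice. It favors turns that keep being used over turns that were attended to once, which matches the idea of retaining what the model repeatedly relies on.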

Customer

AI assistant builders, CRM copilots, sales assistants, and customer-support systems.

Differentiation

Most memory systems store what was mentioned. ConvoCache stores what the model used.

Risks

  • KV persistence across model versions is hard.
  • Storage costs can grow quickly.
  • Attention is not always a faithful explanation of importance.
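To give the storage risk a rough scale, the size of a standard transformer KV cache can be estimated from the model's dimensions: each token stores one key and one value vector per layer and per KV head. The model dimensions below are illustrative placeholders, not a specific model and not figures from ConvoCache.

```python
def kv_bytes_per_token(
    n_layers: int, n_kv_heads: int, head_dim: int, bytes_per_element: int
) -> int:
    """Bytes of KV cache for one token: keys plus values across all layers."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_element


def kv_bytes_for_turn(n_tokens: int, per_token_bytes: int) -> int:
    """Bytes needed to persist the KV state of a turn with n_tokens tokens."""
    return n_tokens * per_token_bytes


# Hypothetical model: 32 layers, 8 KV heads, head dimension 128, fp16 (2 bytes).
per_token = kv_bytes_per_token(
    n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_element=2
)
per_turn = kv_bytes_for_turn(n_tokens=200, per_token_bytes=per_token)

print(f"{per_token / 1024:.0f} KiB per token")
print(f"{per_turn / 1024**2:.0f} MiB per 200-token turn")
```

Under these assumed dimensions a single 200-token turn already costs tens of mebibytes, which is why a tiering policy that evicts or summarizes low-scoring turns matters for persistence.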