# Voice Memory Demo

A Next.js demo showcasing persistent memory for voice AI agents using Gemini Live and MongoDB.

Based on the article: *Building Persistent Memory for Voice AI Agents with MongoDB*

## Features

- 🎙️ **Real-time Voice Interaction** - WebSocket-based voice communication with Gemini Live
- 🧠 **Persistent Memory** - Store and retrieve memories across sessions using MongoDB
- 🔧 **Memory as Tool** - AI decides when to store/retrieve information (not hardcoded rules)
- 🔒 **User Isolation** - Each browser gets a unique ID for privacy
- 🌐 **Global vs Private** - Gemini classifies memories and obfuscates PII in shared data

## Architecture

```
┌─────────────────────┐     ┌──────────────────────┐
│   Browser           │     │   Gemini Live API    │
│   - Mic capture     │◄───►│   WebSocket          │
│   - Audio playback  │     │   gemini-live-2.5-   │
│   - UI              │     │   flash-native-audio │
└─────────────────────┘     └──────────────────────┘
         │
         │ Tool Calls
         ▼
┌─────────────────────┐     ┌──────────────────────┐
│   Next.js API       │────►│   MongoDB Atlas      │
│   /api/memory       │     │   memories collection│
└─────────────────────┘     └──────────────────────┘
```
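When Gemini Live emits a tool call, the browser forwards it to the Next.js API as shown by the "Tool Calls" arrow above. A minimal sketch of that forwarding step; the `ToolCall` shape and the `/api/memory` request body here are assumptions for illustration, not the repo's actual types:

```typescript
// Assumed shape of a tool call received over the Gemini Live WebSocket.
interface ToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

// Build the request the browser would send to the Next.js memory endpoint,
// attaching the per-browser user ID so memories stay scoped correctly.
function buildMemoryRequest(call: ToolCall, userCookie: string) {
  return {
    url: "/api/memory",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ ...call.args, userCookie }),
    },
  };
}
```

The response from `/api/memory` would then be sent back over the WebSocket as the tool result for `call.id`.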

## Quick Start

### 1. Clone and Install

```bash
cd voice-memory-demo
npm install
```

### 2. Set Up Environment Variables

```bash
cp .env.local.example .env.local
```

Edit `.env.local` with your credentials:

```
GOOGLE_API_KEY=your_gemini_api_key
MONGODB_URI=mongodb+srv://...
MONGODB_DB=voice_memory_demo
VOYAGE_AI_API_KEY=your_voyage_ai_key
```

### 3. Run the Development Server

```bash
npm run dev
```

Open http://localhost:3000 in your browser.

### 4. Create MongoDB Indexes

For hybrid search (vector + text) with `$rankFusion` (requires a recent Atlas cluster version), the app creates two Atlas Search indexes on the `memories` collection at startup:

### Vector Search Index

**Index Name:** `memory_vector_index`

Optional: in the Atlas UI, go to **Search Indexes** → **Create Search Index** → **Atlas Vector Search**:

```json
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1024,
      "similarity": "cosine"
    },
    {
      "type": "filter",
      "path": "deploymentId"
    },
    {
      "type": "filter",
      "path": "userCookie"
    },
    {
      "type": "filter",
      "path": "isGlobal"
    }
  ]
}
```

### Atlas Search Index (Text)

**Index Name:** `memory_text_index`

Optional: in the Atlas UI, go to **Search Indexes** → **Create Search Index** → **Atlas Search**:

```json
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "key": {
        "type": "string",
        "analyzer": "lucene.standard"
      },
      "value": {
        "type": "string",
        "analyzer": "lucene.standard"
      },
      "deploymentId": {
        "type": "string"
      },
      "userCookie": {
        "type": "string"
      },
      "isGlobal": {
        "type": "boolean"
      }
    }
  }
}
```

**Note:** If you don't set up these indexes, the app will fall back to regex-based search, which is less accurate for semantic queries.
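With both indexes in place, a hybrid query can combine vector and text results in one aggregation. A minimal sketch of such a `$rankFusion` pipeline, using the index names above; the weights, limits, and candidate counts are illustrative choices, not the repo's actual query:

```typescript
// Build a hybrid-search aggregation pipeline combining $vectorSearch and
// $search via $rankFusion (MongoDB 8.1+). Returned as plain objects so it
// can be passed to collection.aggregate(...).
function hybridSearchPipeline(
  queryVector: number[],
  queryText: string,
  userCookie: string
): Record<string, any>[] {
  // Restrict results to this deployment, and to the caller's private
  // memories plus global ones (fields indexed as "filter" above).
  const filter = {
    deploymentId: "voice-memory-demo",
    userCookie: { $in: [userCookie, "global"] },
  };
  return [
    {
      $rankFusion: {
        input: {
          pipelines: {
            vector: [
              {
                $vectorSearch: {
                  index: "memory_vector_index",
                  path: "embedding",
                  queryVector,
                  numCandidates: 100,
                  limit: 10,
                  filter,
                },
              },
            ],
            text: [
              {
                $search: {
                  index: "memory_text_index",
                  text: { query: queryText, path: ["key", "value"] },
                },
              },
              { $limit: 10 },
            ],
          },
        },
        // Weighting the semantic pipeline higher is an arbitrary choice here.
        combination: { weights: { vector: 0.7, text: 0.3 } },
      },
    },
    { $limit: 10 },
  ];
}
```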

## How It Works

### Memory Tool

The AI agent has access to an `agentMemory` tool with four operations:

| Operation | Description | Example |
| --- | --- | --- |
| `set` | Store a key-value pair | `{op: "set", key: "user_name", value: "Pavel"}` |
| `get` | Retrieve by key | `{op: "get", key: "user_name"}` |
| `delete` | Remove a memory | `{op: "delete", key: "user_name"}` |
| `query` | Search memories | `{op: "query", query: "user preferences"}` |
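The table above maps naturally onto a single Gemini function declaration. A sketch of what that declaration might look like; the field names follow the Gemini function-calling schema, but the descriptions (and whether the SDK expects lowercase or enum-style type names) are assumptions:

```typescript
// Tool definition passed to Gemini Live at session setup, letting the model
// decide when to call agentMemory instead of relying on hardcoded rules.
const agentMemoryTool = {
  functionDeclarations: [
    {
      name: "agentMemory",
      description: "Store, retrieve, delete, or search long-term memories.",
      parameters: {
        type: "object",
        properties: {
          op: { type: "string", enum: ["set", "get", "delete", "query"] },
          key: { type: "string", description: "Memory key, e.g. user_name" },
          value: { type: "string", description: "Value to store (set only)" },
          query: { type: "string", description: "Free-text search (query only)" },
        },
        required: ["op"],
      },
    },
  ],
};
```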

### Memory Classification

When storing a memory, Gemini classifies it:

- **Private**: User-specific data (name, preferences, contact info)
- **Global**: Shared facts (business hours, product info, policies)

Global memories have PII obfuscated (emails → `[EMAIL]`, phones → `[PHONE]`).
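The README only specifies the email and phone masking; a minimal sketch of what that obfuscation step could look like, with illustrative regexes rather than the repo's actual implementation:

```typescript
// Mask common PII patterns before a memory is stored with global scope.
// The regexes are deliberately simple; real PII detection is harder.
function obfuscatePII(text: string): string {
  return text
    .replace(/[\w.+-]+@[\w-]+\.[\w.-]+/g, "[EMAIL]")   // email addresses
    .replace(/\+?\d[\d\s().-]{7,}\d/g, "[PHONE]");     // phone-like digit runs
}
```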

### User Isolation

Each browser generates a UUID stored in `localStorage`:

- Private memories are scoped to this ID
- Global memories are accessible to everyone
- Users can reset their identity from the UI
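A sketch of that identity logic with the storage injected so it can run outside a browser; the function and key names are illustrative stand-ins for the repo's `useUserCookie` hook (in a browser you would use `localStorage` and `crypto.randomUUID()` directly):

```typescript
import { randomUUID } from "node:crypto";

// Minimal localStorage-like interface so the logic is testable in Node.
interface KVStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Return the stored per-browser ID, generating one on first visit.
// Deleting the key (the UI's "reset identity") yields a fresh UUID next call.
function getUserCookie(store: KVStore, key = "voice-memory-user-id"): string {
  let id = store.getItem(key);
  if (!id) {
    id = randomUUID();
    store.setItem(key, id);
  }
  return id;
}
```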

## File Structure

```
voice-memory-demo/
├── src/
│   ├── app/
│   │   ├── api/memory/route.ts   # Memory API endpoint
│   │   ├── page.tsx              # Main page
│   │   ├── layout.tsx            # App layout
│   │   └── globals.css           # Styles
│   ├── components/
│   │   ├── VoiceAgent.tsx        # Main voice interface
│   │   └── MemoryPanel.tsx       # Memory debug panel
│   ├── hooks/
│   │   ├── useGeminiLive.ts      # Gemini WebSocket hook
│   │   └── useUserCookie.ts      # User ID management
│   └── lib/
│       ├── mongodb.ts            # MongoDB connection
│       ├── memory-service.ts     # Memory CRUD operations
│       └── gemini-tools.ts       # Tool definitions
├── public/
│   └── audio-processor.js        # AudioWorklet for mic capture
└── .env.local.example            # Environment template
```

## Try These Prompts

After connecting, try saying:

- "My name is [your name]"
- "I live in [city]"
- "I prefer email over phone calls"
- "What do you remember about me?"
- "What's my name?"
- "Forget my name"

## Technical Details

### Audio Format

- **Input**: PCM 16-bit, 16 kHz, mono
- **Output**: PCM 16-bit, 24 kHz, mono
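The Web Audio API delivers microphone samples as `Float32Array` values in [-1, 1], so the capture path has to convert them to the 16-bit PCM listed above. A sketch of the usual conversion (this mirrors what an AudioWorklet capture step typically does, not necessarily the repo's exact `audio-processor.js`):

```typescript
// Convert Web Audio float samples ([-1, 1]) to signed 16-bit PCM.
// Asymmetric scaling: negative values map onto [-32768, 0),
// positive onto [0, 32767].
function floatTo16BitPCM(input: Float32Array): Int16Array {
  const out = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    const s = Math.max(-1, Math.min(1, input[i])); // clamp against clipping
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}
```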

### Model

Using `gemini-2.5-flash-native-audio-preview-12-2025` for real-time voice-to-voice interaction with tool-calling support.

### MongoDB Schema

```js
{
  deploymentId: "voice-memory-demo",
  key: "user_name",
  value: "Pavel",
  userCookie: "uuid-xxx" | "global",
  isGlobal: false,
  embedding: [0.123, -0.456, ...],  // 1024-dim VoyageAI vector (if enabled)
  createdAt: ISODate(),
  updatedAt: ISODate()
}
```
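Given this schema, a `set` operation is naturally an upsert keyed on deployment, key, and scope. A sketch of how the update could be assembled, as a pure builder so it is easy to test; the helper name and exact filter are illustrative, not the repo's `memory-service.ts`:

```typescript
interface SetArgs {
  key: string;
  value: string;
  userCookie: string;
  isGlobal: boolean;
  embedding?: number[];
}

// Build the filter/update/options triple for an upsert into `memories`,
// suitable for collection.updateOne(filter, update, options).
function buildSetUpdate(args: SetArgs) {
  const now = new Date();
  return {
    filter: {
      deploymentId: "voice-memory-demo",
      key: args.key,
      // Global memories share one scope; private ones are per-browser.
      userCookie: args.isGlobal ? "global" : args.userCookie,
    },
    update: {
      $set: {
        value: args.value,
        isGlobal: args.isGlobal,
        updatedAt: now,
        // Only set the embedding field when one was computed.
        ...(args.embedding ? { embedding: args.embedding } : {}),
      },
      $setOnInsert: { createdAt: now },
    },
    options: { upsert: true }, // insert on first set, overwrite afterwards
  };
}
```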

## Troubleshooting

### "WebSocket connection error"

- Check that your `GOOGLE_API_KEY` is valid
- Ensure you have access to the Gemini Live API

### No audio playback

- Check browser permissions for audio
- Try clicking the page first (browsers require user interaction)

### Memory not saving

- Verify `MONGODB_URI` is correct
- Check MongoDB network access (IP whitelist)

## License

MIT