Enhancing Gemini 2.5 with Long-Term Memory Using Mem0

Models & Research

The Engineer

7 Jul 2025 · 3 min read

This article shows how to upgrade Gemini 2.5 with Mem0 for long-term memory, transforming stateless interactions into personalized, context-rich conversations that remember user preferences and history.

By default, large language models (LLMs) like Gemini 2.5 are stateless, meaning they don't retain information from previous interactions. This can be a significant limitation when building personalized and context-aware AI applications. However, by integrating long-term memory systems, you can create more engaging and helpful chatbots that remember user details and provide relevant responses.

In this guide, we'll explore how to add long-term memory to your Gemini 2.5 chatbot using the Mem0 open-source tool. This integration will enable your chatbot to:

Remember details about users from past conversations.
Provide more relevant and personalized answers.
Avoid repetitive questioning.

How Does Mem0 Work?

Mem0 is designed to equip AI agents with scalable long-term memory, addressing the limitations of fixed context windows in LLMs. The process involves four key steps:

Extract Salient Information: Use an LLM to summarize conversations and extract important details from recent messages.
Process Context: Compare new information against existing memories using semantic similarity.
Update Memory: Perform actions like ADD, UPDATE, DELETE, or NOOP for the Mem0g variant (which handles graph data) by extracting entities and relationships.
Retrieve Relevant Memories: Use vector similarity search to fetch relevant memories for response generation.

Mem0 uses vector embeddings to store and retrieve semantic information, maintaining user-specific context across sessions and efficiently retrieving relevant past interactions.

Setup

To get started, you need to install the necessary libraries and obtain an API key:

Install google-genai and mem0ai:

!uv pip install google-genai mem0ai --upgrade

Obtain an API key from Google AI Studio: API Key

Memory Initialization

For building the memory system, you need to configure two main components:

LLM: This model processes the conversation, understands the content, and extracts key information to be stored as memories.
Embedding Model: This model converts extracted text into numerical representations (vectors), allowing Mem0 to efficiently search and retrieve relevant memories.

In this example, we will use Google's Gemini models for both tasks:

LLM: gemini-2.5-flash
Embedding Model: text-embedding-004

We will also use a local Qdrant instance as our vector store. Mem0 supports multiple vector stores, including MongoDB and others.

Implementation Details

Here’s a step-by-step guide to setting up the memory system:

Initialize the LLM and Embedding Model:

from google.genai import LanguageModel, TextEmbedding
from mem0ai import MemoryStore

# Initialize the LLM
llm = LanguageModel.from_pretrained("gemini-2.5-flash")

# Initialize the embedding model
embedding_model = TextEmbedding.from_pretrained("text-embedding-004")

Set Up the Memory Store:

from qdrant_client import QdrantClient

# Connect to a local Qdrant instance
client = QdrantClient(host="localhost", port=6333)

# Initialize the memory store with the LLM and embedding model
memory_store = MemoryStore(llm, embedding_model, client)

Process Conversations:

def process_conversation(user_input, user_id):
    # Get the conversation history for the user
    conversation_history = get_conversation_history(user_id)

    # Extract salient information from the conversation
    summary = llm.summarize(conversation_history + [user_input])

    # Process context and extract new information
    new_info = llm.extract_information(summary, conversation_history)

    # Update the memory store
    memory_store.update_memory(user_id, new_info)

Retrieve Memories:

def get_relevant_memories(user_id, query):
    # Retrieve relevant memories based on the query
    relevant_memories = memory_store.retrieve_memories(user_id, query)

    return relevant_memories