Recency Weighting in RAG: When Newer Information Matters More

In Retrieval Augmented Generation (RAG) systems, we often treat all information as equally relevant, regardless of when it was created. But what if the freshness of information really matters? News articles, financial data, and rapidly evolving fields like AI itself demand that we prioritise newer information. This post delves into recency weighting: a technique to inject temporal awareness into your RAG pipeline. We’ll explore when it’s crucial, when it’s overkill, and provide a hands-on guide to implementing it using ChromaDB’s metadata filtering.
Why Recency Matters in RAG
Traditional RAG systems focus primarily on semantic similarity between the user’s query and the documents in your knowledge base. While this is a solid foundation, it neglects a critical dimension: time. Consider these scenarios:
- Financial Analysis: A stock recommendation from six months ago might be entirely irrelevant (or even harmful) given the market’s volatility.
- Tech Support: Solutions to software bugs can change rapidly. Older forum posts might contain outdated or incorrect advice.
- Legal Research: Laws and regulations are constantly evolving. Relying on outdated case law can lead to serious errors.
- Tracking Product Development: “What features are planned for version X?” needs to prioritise the most recent project roadmap.
In these cases, the most semantically similar document might not be the most useful. Recency weighting addresses this by incorporating a time-based score into the retrieval process, boosting the relevance of newer documents.
When to Use (and Not Use) Recency Weighting
Before diving into the implementation, let’s clarify when recency weighting is appropriate.
Good Candidates:
- Fast-Evolving Information: Fields where knowledge changes rapidly (e.g., technology, finance, current events).
- Time-Sensitive Queries: Questions that explicitly or implicitly refer to a specific timeframe (e.g., “What were the main talking points this week?”).
- Data with Expiration Dates: Information that becomes invalid or less relevant over time (e.g., promotions, temporary offers).
Poor Candidates:
- Timeless Knowledge: Fundamental scientific principles, historical facts, or literary analysis.
- Stable Domains: Fields where knowledge evolves slowly (e.g., classic literature, basic mathematics).
- Lack of Timestamps: If your documents lack reliable creation or modification dates, recency weighting is impossible.
It’s also worth noting that over-emphasising recency can be detrimental. The oldest document might still be the most comprehensive or contain crucial context that newer documents lack. The key is to find a balance between relevance and recency.
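That balance can be captured with a single blending weight. Here’s a minimal, self-contained sketch of the idea (the function name and scores are illustrative, not from any library):

```python
def blend(similarity, recency, alpha=0.75):
    """alpha=1.0 ignores recency; alpha=0.0 ignores semantic similarity."""
    return alpha * similarity + (1 - alpha) * recency

# An old but highly relevant document can still outrank a fresh,
# loosely related one when alpha favours similarity:
print(blend(similarity=0.9, recency=0.2))  # ~0.725: old, comprehensive doc
print(blend(similarity=0.5, recency=1.0))  # ~0.625: new, tangential doc
```

The rest of this post builds exactly this kind of blend on top of a real vector search.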
Implementing Recency Weighting with ChromaDB
Let’s walk through a practical example of implementing recency weighting in a RAG system using ChromaDB. We’ll use a simplified news article scenario. First, let’s install the necessary libraries:
```bash
pip install chromadb pandas numpy
```

Here’s some example data consisting of news article snippets and their publication dates. We’ll store these in a pandas DataFrame.
```python
import datetime

import pandas as pd

data = {
    'article': [
        "Company X announces record profits for Q1 2023.",
        "Analysts predict a downturn in the tech sector.",
        "Company X releases its Q2 2023 earnings report, showing a slight decline.",
        "New AI model achieves state-of-the-art performance.",
        "Company X's stock price plummets after disappointing Q3 2023 results.",
        "Breakthrough in renewable energy technology announced.",
        "Company X recovers slightly in Q4 2023, but concerns remain."
    ],
    'date': [
        datetime.date(2023, 1, 25),
        datetime.date(2023, 2, 10),
        datetime.date(2023, 4, 28),
        datetime.date(2023, 5, 15),
        datetime.date(2023, 7, 30),
        datetime.date(2023, 9, 5),
        datetime.date(2023, 11, 20)
    ]
}
df = pd.DataFrame(data)
```

Now, we’ll load this data into ChromaDB. We’ll need to convert the ‘date’ column into strings first, since ChromaDB metadata values must be primitives (strings, numbers, or booleans).
```python
import chromadb

# Initialize an in-memory ChromaDB client
client = chromadb.Client()

# Create a new collection
collection = client.create_collection("news_articles")

# Convert dates to ISO format strings (YYYY-MM-DD)
df['date_str'] = df['date'].astype(str)

# Add the articles, storing each publication date as metadata
collection.add(
    documents=df['article'].tolist(),
    metadatas=[{'date': date} for date in df['date_str'].tolist()],
    ids=[f"article_{i}" for i in range(len(df))]
)
print(f"Added {collection.count()} documents to ChromaDB")
```

With the data in ChromaDB, we can now implement the recency weighting. A common approach is to use an exponential decay function, which assigns a higher score to more recent documents, with the score decreasing exponentially as the document’s age increases.
Here’s how to define a recency weighting function:
```python
import numpy as np
from datetime import datetime

def recency_weight(date_str, halflife=90):
    """
    Calculates a recency weight based on an exponential decay function.

    Args:
        date_str (str): Date string in ISO format (YYYY-MM-DD).
        halflife (int): The number of days it takes for the weight to halve.

    Returns:
        float: Recency weight in (0, 1], where 1.0 means published today.
    """
    date = datetime.strptime(date_str, '%Y-%m-%d').date()
    age = (datetime.now().date() - date).days
    return np.exp(-np.log(2) * age / halflife)  # halves every `halflife` days

# Example usage:
today = datetime.now().date()
print(f"Weight for today: {recency_weight(str(today)):.2f}")
print(f"Weight for 2023-01-01: {recency_weight('2023-01-01'):.2f}")
```

The `halflife` parameter controls how quickly the weight decays: a smaller halflife penalises older documents more severely. Experiment with different values to find the optimal balance for your specific use case.
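To build intuition for `halflife`, this small standalone sketch tabulates the decay weight for a few document ages under three candidate halflives (the values are illustrative, not tied to the news data above):

```python
import numpy as np

def decay(age_days, halflife):
    """Exponential decay: the weight halves every `halflife` days."""
    return np.exp(-np.log(2) * age_days / halflife)

ages = (0, 30, 90, 180)
for halflife in (30, 90, 180):
    row = ", ".join(f"{decay(age, halflife):.2f}" for age in ages)
    print(f"halflife={halflife:>3} days: {row}")
```

With a 30-day halflife a six-month-old document is almost invisible, while with a 180-day halflife it still carries half its original weight.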
Now, let’s integrate this recency weighting into our ChromaDB query. We’ll combine it with a standard similarity search.
```python
def weighted_search(query, k=3, halflife=90, alpha=0.75):
    """
    Performs a weighted search in ChromaDB, combining similarity and recency.

    Args:
        query (str): The search query.
        k (int): The number of results to return.
        halflife (int): The halflife (in days) for the recency weighting.
        alpha (float): The weight given to similarity (1 - alpha is given to recency).

    Returns:
        list: A list of (document, metadata, combined_score) tuples, best first.
    """
    # Over-fetch so that recency can promote documents sitting just
    # outside the top k by pure similarity
    results = collection.query(
        query_texts=[query],
        n_results=min(k * 4, collection.count())
    )
    documents = results['documents'][0]
    metadatas = results['metadatas'][0]
    distances = results['distances'][0]

    weighted_results = []
    for doc, meta, dist in zip(documents, metadatas, distances):
        # Convert distance to similarity (higher is better). This assumes
        # distances roughly in [0, 1]; see the normalisation note below
        # for a more robust rescaling.
        similarity_score = 1 - dist
        recency = recency_weight(meta['date'], halflife)
        combined_score = alpha * similarity_score + (1 - alpha) * recency
        weighted_results.append((doc, meta, combined_score))

    # Sort by the combined score, best first, and keep the top k
    weighted_results.sort(key=lambda x: x[2], reverse=True)
    return weighted_results[:k]

# Example query
query = "Company X financial performance"
for doc, meta, score in weighted_search(query):
    print(f"Document: {doc}")
    print(f"Date: {meta['date']}")
    print(f"Score: {score:.2f}")
    print("-" * 20)
```

In this `weighted_search` function:
- We perform a standard similarity search using ChromaDB’s `query` method, fetching more than `k` candidates so recency has room to reorder them.
- We convert the distance from the vector search into a similarity score.
- We calculate the recency weight for each document using our `recency_weight` function.
- We combine the similarity score and recency weight using a weighted average, controlled by the `alpha` parameter. `alpha` determines the balance between semantic relevance and recency: a higher `alpha` prioritises semantic similarity, while a lower `alpha` prioritises recency.
- Finally, we sort the results by the combined score and return the top `k` documents.
By adjusting the `halflife` and `alpha` parameters, you can fine-tune the recency weighting to match the specific characteristics of your data and the needs of your application.
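To see what tuning `alpha` does to an actual ranking, here’s a self-contained sketch using toy (similarity, recency) scores for three hypothetical documents (the names and numbers are made up for illustration):

```python
# Toy (similarity, recency) scores for three hypothetical documents
docs = {
    'old_deep_dive':  (0.95, 0.10),  # highly relevant, very stale
    'recent_summary': (0.70, 0.95),  # fairly relevant, fresh
    'fresh_tangent':  (0.40, 1.00),  # barely relevant, brand new
}

for alpha in (1.0, 0.75, 0.4):
    ranked = sorted(
        docs,
        key=lambda d: alpha * docs[d][0] + (1 - alpha) * docs[d][1],
        reverse=True,
    )
    print(f"alpha={alpha}: {ranked}")

# alpha=1.0: ['old_deep_dive', 'recent_summary', 'fresh_tangent']
# alpha=0.75: ['recent_summary', 'old_deep_dive', 'fresh_tangent']
# alpha=0.4: ['recent_summary', 'fresh_tangent', 'old_deep_dive']
```

Even a moderate `alpha` of 0.75 lets freshness flip the top result here; pushing `alpha` lower lets recency dominate entirely.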
Considerations and Enhancements
- Normalisation: Consider normalising the similarity scores and recency weights to a common scale (e.g., 0 to 1) before combining them. This prevents one factor from dominating the other due to differing ranges.
- Metadata Filtering: Use ChromaDB’s metadata filtering capabilities to further refine your search. For example, you could filter by document type or source before applying recency weighting. The Quartalis platform provides a managed ChromaDB instance with advanced filtering options, making it easier to build complex, time-aware RAG pipelines.
- Dynamic Halflife: In some cases, a fixed halflife might not be appropriate. Consider using a dynamic halflife that adjusts based on the query or the document’s content.
- Hybrid Approaches: Experiment with other approaches to incorporating time, such as re-ranking the initial search results based on recency or using a time-aware language model.
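For the normalisation point above, here’s a minimal sketch of min-max rescaling raw vector-store distances to [0, 1] before blending (illustrative values; it assumes the batch contains at least two distinct scores):

```python
def min_max(scores):
    """Rescale raw scores to [0, 1]; assumes at least two distinct values."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]

# Raw L2 distances can fall well outside [0, 1]; normalising first keeps
# the alpha blend between similarity and recency meaningful.
distances = [0.8, 1.4, 2.1]
similarities = [1 - s for s in min_max(distances)]  # smaller distance -> higher similarity
print(similarities)  # best match first: 1.0, then ~0.54, then 0.0
```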
Wrapping Up
Recency weighting is a powerful technique for improving the relevance of RAG systems when dealing with time-sensitive information. By combining semantic similarity with temporal awareness, you can ensure that your users receive the most up-to-date and relevant information. The implementation demonstrated here with ChromaDB provides a solid foundation for building time-aware RAG pipelines. Remember to carefully consider the characteristics of your data and the specific needs of your application when choosing the appropriate recency weighting strategy and parameters.