Retrieval in Kastrax RAG Systems ✅

Retrieval is the critical middle step in any RAG (Retrieval-Augmented Generation) system, responsible for finding the most relevant information to answer user queries. Kastrax provides a comprehensive retrieval system with multiple strategies to ensure high-quality, relevant results.

Retrieval Architecture ✅

Kastrax’s retrieval system is built on a flexible, multi-layered architecture that enables sophisticated information retrieval:

Query Processing: Transforms user queries into vector embeddings and structured search parameters
Vector Search: Performs similarity search against embedded documents in vector databases
Metadata Filtering: Applies structured filters to narrow down results based on document attributes
Reranking: Refines initial results using more sophisticated relevance algorithms
Result Processing: Formats and prepares retrieved information for generation

This architecture allows for a range of retrieval strategies, from simple vector similarity search to complex hybrid approaches combining multiple retrieval methods.

How Retrieval Works ✅

The retrieval process in Kastrax follows these key steps:

Query Embedding: The user’s query is converted to a vector embedding using the same model used for document embeddings
Vector Similarity: This embedding is compared to stored document embeddings using vector similarity metrics (cosine, dot product, or Euclidean distance)
Result Ranking: The most similar chunks are retrieved and ranked by similarity score
Optional Processing: Retrieved chunks can be further processed through:
- Metadata filtering to narrow results based on document attributes
- Reranking to improve relevance using more sophisticated algorithms
- Hybrid retrieval combining vector search with keyword or graph-based approaches

Basic Retrieval ✅

The simplest approach is direct semantic search, which uses vector similarity to find chunks that are semantically similar to the query:

BasicRetrieval.kt


import ai.kastrax.rag.embedding.OpenAIEmbedder
import ai.kastrax.rag.vectordb.PgVectorAdapter
import ai.kastrax.rag.retrieval.RetrievalOptions
 
// Create an embedder for query embedding
val embedder = OpenAIEmbedder(
    options = EmbeddingOptions(
        modelName = "text-embedding-3-small",
        apiKey = System.getenv("OPENAI_API_KEY")
    )
)
 
// Generate embedding for the query
val query = "What are the main points in the article?"
val queryEmbedding = embedder.embedText(query)
 
// Create a vector database adapter
val vectorDB = PgVectorAdapter(
    connectionString = System.getenv("POSTGRES_CONNECTION_STRING")
)
 
// Perform semantic search
val results = vectorDB.search(
    indexName = "document_embeddings",
    embedding = queryEmbedding,
    limit = 10,
    minScore = 0.7  // Only return results with similarity score >= 0.7
)
 
// Process the results
results.forEach { result ->
    println("Score: ${result.score}")
    println("Content: ${result.chunk.content.take(100)}...")
    println("Metadata: ${result.chunk.metadata}")
    println()
}

Results include both the document content and a similarity score:


// Example search result
SearchResult(
    chunk = EmbeddedChunk(
        content = "Climate change poses significant challenges...",
        embedding = floatArrayOf(...),  // Vector embedding
        metadata = mapOf(
            "source" to "article1.txt",
            "page" to 1,
            "category" to "environment"
        )
    ),
    score = 0.89  // Similarity score (0.0-1.0)
)

Using the Retrieval Engine

Kastrax provides a high-level RetrievalEngine that simplifies the retrieval process:

RetrievalEngine.kt


import ai.kastrax.rag.retrieval.RetrievalEngine
import ai.kastrax.rag.retrieval.RetrievalOptions
 
// Create a retrieval engine with the vector database
val retrievalEngine = RetrievalEngine(
    vectorDB = vectorDB,
    embedder = embedder
)
 
// Retrieve relevant chunks for a query
val retrievedChunks = retrievalEngine.retrieve(
    query = "What are the main points in the article?",
    options = RetrievalOptions(
        indexName = "document_embeddings",
        limit = 5,
        minScore = 0.7
    )
)
 
// Process the retrieved chunks
retrievedChunks.forEach { result ->
    println("Score: ${result.score}")
    println("Content: ${result.chunk.content}")
}

The RetrievalEngine handles query embedding and search in a single operation, making it easier to implement retrieval in your applications.

Advanced Retrieval Strategies ✅

Metadata Filtering

Metadata filtering allows you to narrow down search results based on document attributes. This is particularly useful when you have documents from different sources, time periods, or with specific characteristics. Kastrax provides a unified MongoDB-style query syntax that works across all supported vector databases.

Here are examples of different filtering approaches:

MetadataFiltering.kt


import ai.kastrax.rag.retrieval.RetrievalEngine
import ai.kastrax.rag.retrieval.RetrievalOptions
 
// Create a retrieval engine
val retrievalEngine = RetrievalEngine(vectorDB, embedder)
 
// Simple equality filter
val results1 = retrievalEngine.retrieve(
    query = "What are the main points in the article?",
    options = RetrievalOptions(
        indexName = "document_embeddings",
        filter = mapOf(
            "source" to "article1.txt"  // Only return documents from this source
        )
    )
)
 
// Numeric comparison
val results2 = retrievalEngine.retrieve(
    query = "What are the main points in the article?",
    options = RetrievalOptions(
        indexName = "document_embeddings",
        filter = mapOf(
            "price" to mapOf(
                "$gt" to 100  // Only return documents with price > 100
            )
        )
    )
)
 
// Multiple conditions
val results3 = retrievalEngine.retrieve(
    query = "What are the main points in the article?",
    options = RetrievalOptions(
        indexName = "document_embeddings",
        filter = mapOf(
            "category" to "electronics",
            "price" to mapOf(
                "$lt" to 1000  // Price < 1000
            ),
            "inStock" to true
        )
    )
)
 
// Array operations
val results4 = retrievalEngine.retrieve(
    query = "What are the main points in the article?",
    options = RetrievalOptions(
        indexName = "document_embeddings",
        filter = mapOf(
            "tags" to mapOf(
                "$in" to listOf("sale", "new")  // Tags include either "sale" or "new"
            )
        )
    )
)
 
// Logical operators
val results5 = retrievalEngine.retrieve(
    query = "What are the main points in the article?",
    options = RetrievalOptions(
        indexName = "document_embeddings",
        filter = mapOf(
            "$or" to listOf(
                mapOf("category" to "electronics"),
                mapOf("category" to "accessories")
            ),
            "$and" to listOf(
                mapOf("price" to mapOf("$gt" to 50)),
                mapOf("price" to mapOf("$lt" to 200))
            )
        )
    )
)

Common Filter Operators

Operator	Description	Example
Equality	Direct match	`"category" to "electronics"`
`$gt`	Greater than	`"price" to mapOf("$gt" to 100)`
`$gte`	Greater than or equal	`"price" to mapOf("$gte" to 100)`
`$lt`	Less than	`"price" to mapOf("$lt" to 1000)`
`$lte`	Less than or equal	`"price" to mapOf("$lte" to 1000)`
`$in`	In array	`"tags" to mapOf("$in" to listOf("sale", "new"))`
`$nin`	Not in array	`"tags" to mapOf("$nin" to listOf("discontinued"))`
`$or`	Logical OR	`"$or" to listOf(mapOf(...), mapOf(...))`
`$and`	Logical AND	`"$and" to listOf(mapOf(...), mapOf(...))`
`$not`	Logical NOT	`"$not" to mapOf("category" to "outdated")`

Common Use Cases for Metadata Filtering

Source Filtering: Limit results to specific document sources or types
Date Filtering: Retrieve documents from specific time periods
Category Filtering: Focus on documents with specific categories or tags
Numerical Filtering: Filter by numerical ranges (e.g., price, rating, page count)
Attribute Filtering: Select documents with specific attributes (e.g., language, author, format)
Combination Filtering: Combine multiple conditions for precise querying

Vector Query Tool

Sometimes you want to give your agent the ability to query a vector database directly. The Vector Query Tool allows your agent to be in charge of retrieval decisions, combining semantic search with optional filtering and reranking based on the agent’s understanding of the user’s needs.

VectorQueryTool.kt


import ai.kastrax.rag.vectordb.PgVectorAdapter
import ai.kastrax.rag.tools.VectorQueryTool
import ai.kastrax.rag.embedding.OpenAIEmbedder
import ai.kastrax.core.agent.Agent
import ai.kastrax.core.agent.AgentBuilder
 
// Create a vector database adapter
val vectorDB = PgVectorAdapter(
    connectionString = System.getenv("POSTGRES_CONNECTION_STRING")
)
 
// Create an embedder
val embedder = OpenAIEmbedder()
 
// Create the vector query tool
val vectorQueryTool = VectorQueryTool(
    vectorDB = vectorDB,
    embedder = embedder,
    indexName = "document_embeddings",
    description = "Search for information in the knowledge base"
)
 
// Create an agent with the vector query tool
val agent = AgentBuilder()
    .withTools(listOf(vectorQueryTool))
    .withModel("gpt-4")
    .withSystemPrompt("You are a helpful assistant with access to a knowledge base.")
    .build()
 
// The agent can now use the tool to query the vector database
val result = agent.run("Find information about climate change impacts on agriculture")

When creating the tool, pay special attention to the tool’s name and description - these help the agent understand when and how to use the retrieval capabilities. For example, you might name it “SearchKnowledgeBase” and describe it as “Search through our documentation to find relevant information about specific topics.”

The agent can use this tool to:

Formulate semantic search queries
Apply metadata filters
Adjust the number of results
Rerank results based on relevance to the user’s question

This approach gives the agent more control over the retrieval process, allowing it to adapt its search strategy based on the conversation context and user needs.

Advanced Retrieval Techniques ✅

Hybrid Search

Hybrid search combines vector similarity search with traditional keyword search to improve retrieval quality. This approach is particularly useful when you need to find documents that are both semantically similar to a query and contain specific keywords.

HybridSearch.kt


import ai.kastrax.rag.retrieval.HybridRetrievalEngine
import ai.kastrax.rag.retrieval.HybridRetrievalOptions
 
// Create a hybrid retrieval engine
val hybridEngine = HybridRetrievalEngine(
    vectorDB = vectorDB,
    embedder = embedder,
    textIndexer = textIndexer  // For keyword search
)
 
// Perform hybrid search
val results = hybridEngine.retrieve(
    query = "climate change agriculture drought resistant crops",
    options = HybridRetrievalOptions(
        indexName = "document_embeddings",
        vectorWeight = 0.7,  // Weight for vector similarity (0.0-1.0)
        textWeight = 0.3,    // Weight for text similarity (0.0-1.0)
        limit = 10
    )
)

Hybrid search is particularly effective for:

Queries containing specific technical terms or proper nouns
Situations where exact keyword matching is important
Improving precision while maintaining recall

Reranking

Reranking improves retrieval quality by applying a more sophisticated relevance model to the initial search results. Kastrax supports multiple reranking approaches:

Reranking.kt


import ai.kastrax.rag.retrieval.RetrievalEngine
import ai.kastrax.rag.retrieval.RetrievalOptions
import ai.kastrax.rag.reranking.CrossEncoderReranker
import ai.kastrax.rag.reranking.RerankerOptions
 
// Create a reranker
val reranker = CrossEncoderReranker(
    options = RerankerOptions(
        modelName = "cross-encoder/ms-marco-MiniLM-L-6-v2",
        threshold = 0.7  // Minimum score to include in results
    )
)
 
// Create a retrieval engine with reranking
val retrievalEngine = RetrievalEngine(
    vectorDB = vectorDB,
    embedder = embedder,
    reranker = reranker
)
 
// Retrieve and rerank results
val results = retrievalEngine.retrieve(
    query = "What are the effects of climate change on agriculture?",
    options = RetrievalOptions(
        indexName = "document_embeddings",
        limit = 20,  // Retrieve more initial results for reranking
        rerank = true  // Enable reranking
    )
)

Reranking is particularly useful for:

Improving precision in the final results
Handling complex or ambiguous queries
Reducing the impact of embedding model limitations

Contextual Retrieval

Contextual retrieval takes into account the conversation history or user context when retrieving information:

ContextualRetrieval.kt


import ai.kastrax.rag.retrieval.ContextualRetrievalEngine
import ai.kastrax.rag.retrieval.ContextualRetrievalOptions
import ai.kastrax.rag.context.ConversationContext
 
// Create a conversation context
val conversationContext = ConversationContext()
 
// Add messages to the context
conversationContext.addMessage("user", "Tell me about climate change impacts.")
conversationContext.addMessage("assistant", "Climate change has various impacts including rising temperatures, sea level rise, and extreme weather events.")
 
// Create a contextual retrieval engine
val contextualEngine = ContextualRetrievalEngine(
    vectorDB = vectorDB,
    embedder = embedder
)
 
// Perform contextual retrieval
val results = contextualEngine.retrieve(
    query = "What about its effects on agriculture specifically?",
    options = ContextualRetrievalOptions(
        indexName = "document_embeddings",
        context = conversationContext,
        contextWeight = 0.3,  // Weight for conversation context
        limit = 5
    )
)

Contextual retrieval is valuable for:

Maintaining conversation coherence
Resolving ambiguous references in follow-up questions
Providing more relevant results based on user interests

Best Practices ✅

Query Formulation

Be specific: Craft queries that are specific and focused on the information you need
Include key terms: Include important keywords and technical terms in your queries
Avoid ambiguity: Phrase queries to minimize ambiguity and multiple interpretations
Consider query expansion: For important queries, try multiple formulations to improve recall

Retrieval Configuration

Adjust result limits: Retrieve more results for reranking (e.g., 20-50) but fewer for direct use (e.g., 3-5)
Use appropriate similarity metrics: Choose the right similarity metric (cosine, dot product, Euclidean) for your embedding model
Set minimum similarity thresholds: Filter out low-relevance results with minimum score thresholds
Combine retrieval strategies: Use hybrid search and reranking for important queries

Metadata Management

Design metadata thoughtfully: Include metadata fields that will be useful for filtering
Be consistent: Use consistent naming and formatting for metadata fields
Include hierarchical metadata: Add category/subcategory relationships for more flexible filtering
Add temporal metadata: Include creation/update dates to filter by recency

Complete RAG Pipeline ✅

Here’s a complete example showing how retrieval fits into the full RAG pipeline:

CompleteRAGPipeline.kt


import ai.kastrax.rag.document.Document
import ai.kastrax.rag.chunking.RecursiveChunker
import ai.kastrax.rag.embedding.OpenAIEmbedder
import ai.kastrax.rag.vectordb.PineconeAdapter
import ai.kastrax.rag.retrieval.RetrievalEngine
import ai.kastrax.rag.reranking.CrossEncoderReranker
import ai.kastrax.rag.generation.LLMGenerator
import ai.kastrax.rag.prompt.PromptTemplate
 
// 1. Document Processing
val document = Document.fromText("""
    # Climate Change and Agriculture
 
    Climate change poses significant challenges to global agriculture.
    Rising temperatures and changing precipitation patterns affect crop yields.
    Farmers must adapt to these changing conditions to ensure food security.
 
    ## Effects on Crop Yields
 
    Studies have shown that for each degree Celsius of warming, there is a
    potential reduction in global yields of wheat by 6%, rice by 3.2%,
    maize by 7.4%, and soybean by 3.1%.
 
    ## Adaptation Strategies
 
    Farmers are implementing various strategies to adapt to climate change:
    - Drought-resistant crop varieties
    - Improved irrigation systems
    - Diversification of crops
    - Precision agriculture techniques
""".trimIndent())
 
// 2. Chunking
val chunker = RecursiveChunker()
val chunks = chunker.chunk(document)
 
// 3. Embedding
val embedder = OpenAIEmbedder()
val embeddedChunks = embedder.embed(chunks)
 
// 4. Vector Storage
val vectorDB = PineconeAdapter()
vectorDB.upsert("document_embeddings", embeddedChunks)
 
// 5. Reranker Setup
val reranker = CrossEncoderReranker()
 
// 6. Retrieval Engine
val retrievalEngine = RetrievalEngine(
    vectorDB = vectorDB,
    embedder = embedder,
    reranker = reranker
)
 
// 7. Query and Retrieve
val query = "How does climate change affect wheat production?"
val retrievedChunks = retrievalEngine.retrieve(
    query = query,
    options = RetrievalOptions(
        indexName = "document_embeddings",
        limit = 3,
        rerank = true
    )
)
 
// 8. Format Context
val context = retrievedChunks.joinToString("\n\n") { it.chunk.content }
 
// 9. Generate Response
val promptTemplate = PromptTemplate(
    template = """
    Answer the following question based on the provided context.
 
    Context:
    {{context}}
 
    Question:
    {{question}}
 
    Answer:
    """.trimIndent()
)
 
val prompt = promptTemplate.format(mapOf(
    "context" to context,
    "question" to query
))
 
val generator = LLMGenerator()
val answer = generator.generate(prompt)
 
// 10. Return Result
println("Query: $query")
println("Answer: $answer")

Conclusion ✅

Effective retrieval is the cornerstone of any successful RAG system. Kastrax provides a comprehensive suite of retrieval capabilities that enable you to find the most relevant information for your users’ queries.

By leveraging advanced techniques like metadata filtering, hybrid search, reranking, and contextual retrieval, you can significantly improve the quality of your RAG system’s responses. Remember to follow best practices for query formulation, retrieval configuration, and metadata management to get the most out of Kastrax’s retrieval capabilities.

The right retrieval strategy depends on your specific use case, data characteristics, and performance requirements. Experiment with different approaches and combinations to find the optimal solution for your application.

Knowledge Graph Retrieval

For documents with complex relationships, knowledge graph retrieval can follow connections between chunks. This approach is particularly useful when:

KnowledgeGraphRetrieval.kt


import ai.kastrax.rag.retrieval.GraphRetrievalEngine
import ai.kastrax.rag.retrieval.GraphRetrievalOptions
import ai.kastrax.rag.graph.KnowledgeGraph
 
// Create a knowledge graph
val knowledgeGraph = KnowledgeGraph()
 
// Add nodes and relationships from embedded chunks
embeddedChunks.forEach { chunk ->
    knowledgeGraph.addNode(
        id = chunk.id,
        content = chunk.content,
        embedding = chunk.embedding,
        metadata = chunk.metadata
    )
}
 
// Add relationships between nodes
knowledgeGraph.addRelationship(
    sourceId = "chunk1",
    targetId = "chunk2",
    type = "references",
    weight = 0.8
)
 
// Create a graph retrieval engine
val graphEngine = GraphRetrievalEngine(
    knowledgeGraph = knowledgeGraph,
    embedder = embedder
)
 
// Retrieve information using graph traversal
val results = graphEngine.retrieve(
    query = "How does climate change affect agriculture?",
    options = GraphRetrievalOptions(
        maxHops = 2,  // Maximum relationship traversal depth
        limit = 5,    // Maximum number of results
        traversalStrategy = "breadth-first"  // BFS or DFS
    )
)

Knowledge graph retrieval is particularly valuable when:

Information is spread across multiple documents
Documents reference each other
You need to traverse relationships to find complete answers
The domain has complex entity relationships