
RAG System Overview

The Kastrax Retrieval-Augmented Generation (RAG) system provides a comprehensive framework for building knowledge-based applications that combine the power of large language models with external data sources. Built with Kotlin’s type safety and concurrency features, the Kastrax RAG system offers high performance, scalability, and integration with various vector databases and embedding models.

What is RAG?

Retrieval-Augmented Generation (RAG) is a technique that enhances language model outputs by retrieving relevant information from external knowledge sources and incorporating it into the generation process. This approach helps to:

  • Provide up-to-date information beyond the model’s training data
  • Ground responses in factual, verifiable sources
  • Reduce hallucinations and improve accuracy
  • Customize responses based on domain-specific knowledge
  • Enable agents to work with proprietary or sensitive information
  • Support complex reasoning with large knowledge bases
  • Improve performance on specialized domains
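Conceptually, the flow has three steps: retrieve relevant text, augment the prompt with it, and generate. The following minimal, framework-free sketch illustrates the idea; the helper functions are hypothetical stand-ins, not Kastrax APIs:

// A minimal, framework-free sketch of the RAG flow: retrieve, augment, generate.
// The three step functions are hypothetical stand-ins, not Kastrax APIs.

fun retrieve(query: String): List<String> =
    listOf("Kastrax is a Kotlin AI agent framework.") // e.g. a vector-store lookup

fun buildPrompt(query: String, context: List<String>): String =
    "Context:\n${context.joinToString("\n")}\n\nQuestion: $query\nAnswer:"

fun generate(prompt: String): String =
    "..." // e.g. an LLM call with the augmented prompt

fun main() {
    val query = "What is Kastrax?"
    val context = retrieve(query)            // 1. retrieve relevant knowledge
    val prompt = buildPrompt(query, context) // 2. augment the prompt with it
    println(generate(prompt))                // 3. generate a grounded answer
}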

RAG System Architecture

The Kastrax RAG system consists of several key components:

  1. Document Processing: Loading, chunking, and transforming documents with customizable strategies
  2. Embedding: Converting text into vector representations using various embedding models
  3. Vector Storage: Storing and retrieving vectors efficiently with multiple database options
  4. Retrieval: Finding relevant information based on queries with semantic and hybrid search
  5. Reranking: Improving retrieval quality through additional ranking models
  6. Context Building: Constructing effective prompts with retrieved information
  7. Generation: Producing responses using the enhanced context
  8. Metadata Filtering: Filtering results based on document metadata
  9. Graph RAG: Leveraging graph relationships for improved retrieval
  10. Evaluation: Measuring and optimizing RAG system performance

Basic RAG Pipeline

Here’s a simple example of creating a RAG pipeline in Kastrax:

import ai.kastrax.rag.document.DocumentLoader
import ai.kastrax.rag.document.TextSplitter
import ai.kastrax.rag.embedding.EmbeddingModel
import ai.kastrax.rag.vectorstore.VectorStore
import ai.kastrax.rag.retrieval.Retriever
import ai.kastrax.rag.context.ContextBuilder
import ai.kastrax.core.agent.agent
import ai.kastrax.integrations.deepseek.deepSeek
import ai.kastrax.integrations.deepseek.DeepSeekModel
import kotlinx.coroutines.runBlocking

fun createRagPipeline() = runBlocking {
    // 1. Load documents
    val loader = DocumentLoader.fromFile("knowledge_base.pdf")
    val documents = loader.load()

    // 2. Split documents into chunks
    val splitter = TextSplitter.recursive(
        chunkSize = 1000,
        chunkOverlap = 200
    )
    val chunks = splitter.split(documents)

    // 3. Create embeddings
    val embeddingModel = EmbeddingModel.create("all-MiniLM-L6-v2")

    // 4. Store in vector database
    val vectorStore = VectorStore.create("chroma")
    vectorStore.addDocuments(chunks, embeddingModel)

    // 5. Create retriever
    val retriever = Retriever.fromVectorStore(
        vectorStore = vectorStore,
        embeddingModel = embeddingModel,
        topK = 5
    )

    // 6. Create context builder
    val contextBuilder = ContextBuilder.create {
        template("""
            Answer the question based on the following context:

            Context:
            {{context}}

            Question: {{query}}

            Answer:
        """.trimIndent())
    }

    // 7. Create RAG agent
    val ragAgent = agent {
        name("RAGAgent")
        description("An agent with RAG capabilities")

        // Configure the LLM
        model = deepSeek {
            apiKey("your-deepseek-api-key")
            model(DeepSeekModel.DEEPSEEK_CHAT)
            temperature(0.7)
        }

        // Configure RAG
        rag {
            retriever(retriever)
            contextBuilder(contextBuilder)
            maxTokens(3000)
        }
    }

    // Use the RAG agent
    val query = "What are the key features of Kastrax?"
    val response = ragAgent.generate(query)
    println(response.text)
}

Document Processing

Loading Documents

Kastrax supports loading documents from various sources:

// Load from a file
val fileLoader = DocumentLoader.fromFile("document.pdf")

// Load from a directory
val dirLoader = DocumentLoader.fromDirectory("documents/")

// Load from a URL
val webLoader = DocumentLoader.fromUrl("https://example.com/document")

// Load from a database
val dbLoader = DocumentLoader.fromDatabase(
    connection = dbConnection,
    query = "SELECT content FROM documents"
)

Document Formats

Kastrax supports multiple document formats:

  • PDF
  • Word (DOCX)
  • Text
  • HTML
  • Markdown
  • CSV
  • JSON
  • XML
  • Databases
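For file-based formats, loading is uniform. The sketch below assumes DocumentLoader.fromFile dispatches on the file extension (an assumption; the exact format-detection behavior isn't specified here):

// Hypothetical sketch: the same fromFile entry point is assumed to handle
// each supported file format, dispatching on the file extension.
val pdfDocs = DocumentLoader.fromFile("report.pdf").load()
val wordDocs = DocumentLoader.fromFile("notes.docx").load()
val markdownDocs = DocumentLoader.fromFile("README.md").load()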

Chunking Documents

Documents are split into smaller chunks so that retrieval returns focused passages and each chunk fits comfortably in the model's context window:

// Recursive text splitter (splits by paragraphs, then sentences, etc.)
val recursiveSplitter = TextSplitter.recursive(
    chunkSize = 1000,
    chunkOverlap = 200
)

// Character text splitter
val charSplitter = TextSplitter.byCharacter(
    chunkSize = 1000,
    chunkOverlap = 200,
    separator = "\n\n"
)

// Token text splitter
val tokenSplitter = TextSplitter.byToken(
    chunkSize = 500,
    chunkOverlap = 100,
    encoderName = "cl100k_base" // OpenAI's tokenizer
)

// Semantic text splitter
val semanticSplitter = TextSplitter.semantic(
    embeddingModel = embeddingModel,
    similarityThreshold = 0.7
)
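Whichever splitter you use, split() returns a list of chunks. With chunkSize = 1000 and chunkOverlap = 200, consecutive chunks share roughly 200 characters (or tokens, for the token splitter), so sentences that straddle a chunk boundary remain retrievable. A small inspection sketch, assuming chunks expose their text as pageContent (the property name used in the context templates below):

// Sketch: split and inspect the resulting chunks.
// Assumes chunks expose a `pageContent` property, the name used by the
// context templates in this guide.
val chunks = recursiveSplitter.split(documents)
println("Produced ${chunks.size} chunks")
println("First chunk preview: ${chunks.first().pageContent.take(80)}...")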

Embedding

Embedding Models

Kastrax supports various embedding models:

// Use a local model
val localModel = EmbeddingModel.create("all-MiniLM-L6-v2")

// Use OpenAI embeddings
val openAiModel = EmbeddingModel.openAi(
    apiKey = "your-openai-api-key",
    model = "text-embedding-3-small"
)

// Use custom embeddings
val customModel = EmbeddingModel.custom(
    embeddingFunction = { text ->
        // Custom embedding logic
        floatArrayOf(/* ... */)
    },
    dimensions = 384
)
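An embedding model maps text to a fixed-length float vector (384 dimensions for all-MiniLM-L6-v2), and semantic similarity is typically measured as cosine similarity between those vectors. A sketch of that idea, assuming an embed method on the model (the method name is an assumption, and it may be suspending in practice):

// Hypothetical usage sketch: embed two texts and compare them.
// `embed` is an assumed method name (wrap in runBlocking if it suspends);
// cosine similarity is standard vector math, independent of Kastrax.
val a = localModel.embed("How do I create an agent?")
val b = localModel.embed("Creating agents in Kastrax")

fun cosine(x: FloatArray, y: FloatArray): Double {
    var dot = 0.0; var nx = 0.0; var ny = 0.0
    for (i in x.indices) { dot += x[i] * y[i]; nx += x[i] * x[i]; ny += y[i] * y[i] }
    return dot / (kotlin.math.sqrt(nx) * kotlin.math.sqrt(ny))
}

println(cosine(a, b)) // semantically close texts score near 1.0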

Vector Storage

Vector Stores

Kastrax supports multiple vector stores:

// In-memory vector store
val memoryStore = VectorStore.inMemory()

// Chroma vector store
val chromaStore = VectorStore.chroma(
    collectionName = "documents",
    url = "http://localhost:8000"
)

// FAISS vector store
val faissStore = VectorStore.faiss(
    indexPath = "faiss_index",
    dimensions = 384
)

// Pinecone vector store
val pineconeStore = VectorStore.pinecone(
    apiKey = "your-pinecone-api-key",
    environment = "us-west1-gcp",
    index = "documents"
)
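Once a store has been populated with addDocuments (as in the pipeline example above), it can be queried for nearest neighbors. A sketch assuming a similaritySearch method (an assumed name; the documented path in this guide is to wrap the store in a Retriever, covered next):

// Hypothetical sketch: query the store directly for the closest chunks.
// `similaritySearch` and the result shape are assumed names; the supported
// path in this guide is wrapping the store in a Retriever (next section).
val results = chromaStore.similaritySearch(
    query = "How does Kastrax handle memory?",
    embeddingModel = embeddingModel,
    topK = 3
)
results.forEach { println(it.pageContent.take(80)) }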

Retrieval

Retrievers

Retrievers find relevant information based on queries:

// Vector store retriever
val vectorRetriever = Retriever.fromVectorStore(
    vectorStore = vectorStore,
    embeddingModel = embeddingModel,
    topK = 5
)

// Keyword retriever
val keywordRetriever = Retriever.keyword(
    documents = documents,
    topK = 5
)

// Hybrid retriever (combines vector and keyword search)
val hybridRetriever = Retriever.hybrid(
    vectorRetriever = vectorRetriever,
    keywordRetriever = keywordRetriever,
    vectorWeight = 0.7,
    keywordWeight = 0.3
)
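A retriever is queried with plain text: a vector retriever embeds the query and searches the store, a keyword retriever matches terms, and the hybrid retriever merges both result lists using the configured weights. A usage sketch, where the retrieve method and the result fields are assumed names:

// Hypothetical usage sketch: `retrieve`, `score`, and `document` are
// assumed names for the query call and its scored results.
val hits = hybridRetriever.retrieve("How do I configure an agent's memory?")
hits.forEach { hit ->
    println("score=${hit.score} source=${hit.document.metadata["source"]}")
}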

Reranking

Rerankers

Rerankers improve retrieval quality by re-scoring an initial candidate set with a stronger (but slower) relevance model:

// Cross-encoder reranker
val crossEncoderReranker = Reranker.crossEncoder(
    model = "cross-encoder/ms-marco-MiniLM-L-6-v2",
    topK = 3
)

// LLM reranker (deepSeekLlm is a DeepSeek client configured elsewhere)
val llmReranker = Reranker.llm(
    llm = deepSeekLlm,
    prompt = "Rank these documents based on relevance to the query: {{query}}\n\nDocuments:\n{{documents}}",
    topK = 3
)
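A reranker usually wraps a base retriever rather than running on its own: the base retriever fetches a broad candidate set (topK = 10, say) and the reranker re-scores it down to the best few. The complete example below composes the two with Retriever.withReranker:

// Fetch broadly with the hybrid retriever, then let the cross-encoder
// keep only the 3 most relevant chunks.
val rerankedRetriever = Retriever.withReranker(
    baseRetriever = hybridRetriever,
    reranker = crossEncoderReranker
)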

Complete RAG Example

Here’s a complete example of a sophisticated RAG system:

import ai.kastrax.rag.*
import ai.kastrax.core.agent.agent
import ai.kastrax.integrations.deepseek.deepSeek
import ai.kastrax.integrations.deepseek.DeepSeekModel
import kotlinx.coroutines.runBlocking

fun createAdvancedRagSystem() = runBlocking {
    // 1. Load and process documents
    val loader = DocumentLoader.fromDirectory("knowledge_base/")
    val documents = loader.load()

    val splitter = TextSplitter.recursive(
        chunkSize = 1000,
        chunkOverlap = 200
    )
    val chunks = splitter.split(documents)

    // 2. Set up embedding and vector store
    val embeddingModel = EmbeddingModel.create("all-MiniLM-L6-v2")
    val vectorStore = VectorStore.chroma("kastrax_docs")

    // 3. Add documents to vector store
    vectorStore.addDocuments(
        documents = chunks,
        embeddingModel = embeddingModel
    )

    // 4. Create a hybrid retriever
    val vectorRetriever = Retriever.fromVectorStore(
        vectorStore = vectorStore,
        embeddingModel = embeddingModel,
        topK = 10
    )
    val keywordRetriever = Retriever.keyword(
        documents = chunks,
        topK = 10
    )
    val hybridRetriever = Retriever.hybrid(
        vectorRetriever = vectorRetriever,
        keywordRetriever = keywordRetriever,
        vectorWeight = 0.7,
        keywordWeight = 0.3
    )

    // 5. Add reranking
    val reranker = Reranker.crossEncoder(
        model = "cross-encoder/ms-marco-MiniLM-L-6-v2",
        topK = 5
    )
    val enhancedRetriever = Retriever.withReranker(
        baseRetriever = hybridRetriever,
        reranker = reranker
    )

    // 6. Create context builder with compression
    val contextBuilder = ContextBuilder.create {
        template("""
            You are a helpful assistant for Kastrax, a powerful AI Agent framework.
            Answer the question based on the following context from the Kastrax documentation.

            Context:
            {{#each documents}}
            Document {{@index}} (Source: {{metadata.source}}):
            {{pageContent}}

            {{/each}}

            Question: {{query}}

            Answer the question using only the information provided in the context.
            If you don't have enough information, say "I don't have enough information."
            Always cite the source documents when providing information.
        """.trimIndent())

        compression {
            enabled(true)
            method(CompressionMethod.MAP_REDUCE)
            maxTokens(3000)
        }
    }

    // 7. Create RAG agent
    val ragAgent = agent {
        name("KastraxDocumentationAssistant")
        description("An assistant that helps with Kastrax documentation")

        // Configure the LLM
        model = deepSeek {
            apiKey("your-deepseek-api-key")
            model(DeepSeekModel.DEEPSEEK_CHAT)
            temperature(0.3) // Lower temperature for more factual responses
        }

        // Configure RAG
        rag {
            retriever(enhancedRetriever)
            contextBuilder(contextBuilder)
            maxTokens(3000)
        }
    }

    // 8. Use the RAG agent
    val query = "How do I implement a custom tool in Kastrax?"
    val response = ragAgent.generate(query)
    println(response.text)
}

Next Steps

Now that you understand the RAG system, you can:

  1. Learn about document processing
  2. Explore vector stores
  3. Implement advanced retrieval techniques
  4. Set up RAG evaluation