
RAG System Overview

The Kastrax Retrieval-Augmented Generation (RAG) system provides a comprehensive framework for building knowledge-based applications that combine the power of large language models with external data sources. Built with Kotlin’s type safety and concurrency features, the Kastrax RAG system offers high performance, scalability, and integration with various vector databases and embedding models.

What is RAG?

Retrieval-Augmented Generation (RAG) is a technique that enhances language model outputs by retrieving relevant information from external knowledge sources and incorporating it into the generation process. This approach helps to:

  • Provide up-to-date information beyond the model’s training data
  • Ground responses in factual, verifiable sources
  • Reduce hallucinations and improve accuracy
  • Customize responses based on domain-specific knowledge
  • Enable agents to work with proprietary or sensitive information
  • Support complex reasoning with large knowledge bases
  • Improve performance on specialized domains
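Conceptually, the flow has three steps: retrieve relevant text, augment the prompt with it, and generate. The following minimal, framework-free sketch illustrates the idea; the helper functions are hypothetical stand-ins, not Kastrax APIs:

// A minimal, framework-free sketch of the RAG flow: retrieve, augment, generate.
// The three step functions are hypothetical stand-ins, not Kastrax APIs.

fun retrieve(query: String): List<String> =
    listOf("Kastrax is a Kotlin AI agent framework.") // e.g. a vector-store lookup

fun buildPrompt(query: String, context: List<String>): String =
    "Context:\n${context.joinToString("\n")}\n\nQuestion: $query\nAnswer:"

fun generate(prompt: String): String =
    "..." // e.g. an LLM call with the augmented prompt

fun main() {
    val query = "What is Kastrax?"
    val context = retrieve(query)            // 1. retrieve relevant knowledge
    val prompt = buildPrompt(query, context) // 2. augment the prompt with it
    println(generate(prompt))                // 3. generate a grounded answer
}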

RAG System Architecture

The Kastrax RAG system consists of several key components:

  1. Document Processing: Loading, chunking, and transforming documents with customizable strategies
  2. Embedding: Converting text into vector representations using various embedding models
  3. Vector Storage: Storing and retrieving vectors efficiently with multiple database options
  4. Retrieval: Finding relevant information based on queries with semantic and hybrid search
  5. Reranking: Improving retrieval quality through additional ranking models
  6. Context Building: Constructing effective prompts with retrieved information
  7. Generation: Producing responses using the enhanced context
  8. Metadata Filtering: Filtering results based on document metadata
  9. Graph RAG: Leveraging graph relationships for improved retrieval
  10. Evaluation: Measuring and optimizing RAG system performance

Basic RAG Pipeline

Here’s a simple example of creating a RAG pipeline in Kastrax:

import ai.kastrax.rag.document.DocumentLoader
import ai.kastrax.rag.document.TextSplitter
import ai.kastrax.rag.embedding.EmbeddingModel
import ai.kastrax.rag.vectorstore.VectorStore
import ai.kastrax.rag.retrieval.Retriever
import ai.kastrax.rag.context.ContextBuilder
import ai.kastrax.core.agent.agent
import ai.kastrax.integrations.deepseek.deepSeek
import ai.kastrax.integrations.deepseek.DeepSeekModel
import kotlinx.coroutines.runBlocking

fun createRagPipeline() = runBlocking {
    // 1. Load documents
    val loader = DocumentLoader.fromFile("knowledge_base.pdf")
    val documents = loader.load()

    // 2. Split documents into chunks
    val splitter = TextSplitter.recursive(
        chunkSize = 1000,
        chunkOverlap = 200
    )
    val chunks = splitter.split(documents)

    // 3. Create embeddings
    val embeddingModel = EmbeddingModel.create("all-MiniLM-L6-v2")

    // 4. Store in vector database
    val vectorStore = VectorStore.create("chroma")
    vectorStore.addDocuments(chunks, embeddingModel)

    // 5. Create retriever
    val retriever = Retriever.fromVectorStore(
        vectorStore = vectorStore,
        embeddingModel = embeddingModel,
        topK = 5
    )

    // 6. Create context builder
    val contextBuilder = ContextBuilder.create {
        template("""
            Answer the question based on the following context:

            Context:
            {{context}}

            Question: {{query}}

            Answer:
        """.trimIndent())
    }

    // 7. Create RAG agent
    val ragAgent = agent {
        name("RAGAgent")
        description("An agent with RAG capabilities")

        // Configure the LLM
        model = deepSeek {
            apiKey("your-deepseek-api-key")
            model(DeepSeekModel.DEEPSEEK_CHAT)
            temperature(0.7)
        }

        // Configure RAG
        rag {
            retriever(retriever)
            contextBuilder(contextBuilder)
            maxTokens(3000)
        }
    }

    // Use the RAG agent
    val query = "What are the key features of Kastrax?"
    val response = ragAgent.generate(query)
    println(response.text)
}

Document Processing

Loading Documents

Kastrax supports loading documents from various sources:

// Load from a file
val fileLoader = DocumentLoader.fromFile("document.pdf")

// Load from a directory
val dirLoader = DocumentLoader.fromDirectory("documents/")

// Load from a URL
val webLoader = DocumentLoader.fromUrl("https://example.com/document")

// Load from a database
val dbLoader = DocumentLoader.fromDatabase(
    connection = dbConnection,
    query = "SELECT content FROM documents"
)

Document Formats

Kastrax supports multiple document formats:

  • PDF
  • Word (DOCX)
  • Text
  • HTML
  • Markdown
  • CSV
  • JSON
  • XML
  • Databases
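For file-based formats, loading is uniform. The sketch below assumes DocumentLoader.fromFile dispatches on the file extension (an assumption; the exact format-detection behavior isn't specified here):

// Hypothetical sketch: the same fromFile entry point is assumed to handle
// each supported file format, dispatching on the file extension.
val pdfDocs = DocumentLoader.fromFile("report.pdf").load()
val wordDocs = DocumentLoader.fromFile("notes.docx").load()
val markdownDocs = DocumentLoader.fromFile("README.md").load()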

Chunking Documents

Documents are split into smaller chunks so that retrieval returns focused passages and each chunk fits comfortably in the model's context window:

// Recursive text splitter (splits by paragraphs, then sentences, etc.)
val recursiveSplitter = TextSplitter.recursive(
    chunkSize = 1000,
    chunkOverlap = 200
)

// Character text splitter
val charSplitter = TextSplitter.byCharacter(
    chunkSize = 1000,
    chunkOverlap = 200,
    separator = "\n\n"
)

// Token text splitter
val tokenSplitter = TextSplitter.byToken(
    chunkSize = 500,
    chunkOverlap = 100,
    encoderName = "cl100k_base" // OpenAI's tokenizer
)

// Semantic text splitter
val semanticSplitter = TextSplitter.semantic(
    embeddingModel = embeddingModel,
    similarityThreshold = 0.7
)
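Whichever splitter you use, split() returns a list of chunks. With chunkSize = 1000 and chunkOverlap = 200, consecutive chunks share roughly 200 characters (or tokens, for the token splitter), so sentences that straddle a chunk boundary remain retrievable. A small inspection sketch, assuming chunks expose their text as pageContent (the property name used in the context templates below):

// Sketch: split and inspect the resulting chunks.
// Assumes chunks expose a `pageContent` property, the name used by the
// context templates in this guide.
val chunks = recursiveSplitter.split(documents)
println("Produced ${chunks.size} chunks")
println("First chunk preview: ${chunks.first().pageContent.take(80)}...")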

Embedding

Embedding Models

Kastrax supports various embedding models:

// Use a local model
val localModel = EmbeddingModel.create("all-MiniLM-L6-v2")

// Use OpenAI embeddings
val openAiModel = EmbeddingModel.openAi(
    apiKey = "your-openai-api-key",
    model = "text-embedding-3-small"
)

// Use custom embeddings
val customModel = EmbeddingModel.custom(
    embeddingFunction = { text ->
        // Custom embedding logic
        floatArrayOf(/* ... */)
    },
    dimensions = 384
)
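An embedding model maps text to a fixed-length float vector (384 dimensions for all-MiniLM-L6-v2), and semantic similarity is typically measured as cosine similarity between those vectors. A sketch of that idea, assuming an embed method on the model (the method name is an assumption, and it may be suspending in practice):

// Hypothetical usage sketch: embed two texts and compare them.
// `embed` is an assumed method name (wrap in runBlocking if it suspends);
// cosine similarity is standard vector math, independent of Kastrax.
val a = localModel.embed("How do I create an agent?")
val b = localModel.embed("Creating agents in Kastrax")

fun cosine(x: FloatArray, y: FloatArray): Double {
    var dot = 0.0; var nx = 0.0; var ny = 0.0
    for (i in x.indices) { dot += x[i] * y[i]; nx += x[i] * x[i]; ny += y[i] * y[i] }
    return dot / (kotlin.math.sqrt(nx) * kotlin.math.sqrt(ny))
}

println(cosine(a, b)) // semantically close texts score near 1.0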

Vector Storage

Vector Stores

Kastrax supports multiple vector stores:

// In-memory vector store
val memoryStore = VectorStore.inMemory()

// Chroma vector store
val chromaStore = VectorStore.chroma(
    collectionName = "documents",
    url = "http://localhost:8000"
)

// FAISS vector store
val faissStore = VectorStore.faiss(
    indexPath = "faiss_index",
    dimensions = 384
)

// Pinecone vector store
val pineconeStore = VectorStore.pinecone(
    apiKey = "your-pinecone-api-key",
    environment = "us-west1-gcp",
    index = "documents"
)
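Once a store has been populated with addDocuments (as in the pipeline example above), it can be queried for nearest neighbors. A sketch assuming a similaritySearch method (an assumed name; the documented path in this guide is to wrap the store in a Retriever, covered next):

// Hypothetical sketch: query the store directly for the closest chunks.
// `similaritySearch` and the result shape are assumed names; the supported
// path in this guide is wrapping the store in a Retriever (next section).
val results = chromaStore.similaritySearch(
    query = "How does Kastrax handle memory?",
    embeddingModel = embeddingModel,
    topK = 3
)
results.forEach { println(it.pageContent.take(80)) }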

Retrieval

Retrievers

Retrievers find relevant information based on queries:

// Vector store retriever
val vectorRetriever = Retriever.fromVectorStore(
    vectorStore = vectorStore,
    embeddingModel = embeddingModel,
    topK = 5
)

// Keyword retriever
val keywordRetriever = Retriever.keyword(
    documents = documents,
    topK = 5
)

// Hybrid retriever (combines vector and keyword search)
val hybridRetriever = Retriever.hybrid(
    vectorRetriever = vectorRetriever,
    keywordRetriever = keywordRetriever,
    vectorWeight = 0.7,
    keywordWeight = 0.3
)
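A retriever is queried with plain text: a vector retriever embeds the query and searches the store, a keyword retriever matches terms, and the hybrid retriever merges both result lists using the configured weights. A usage sketch, where the retrieve method and the result fields are assumed names:

// Hypothetical usage sketch: `retrieve`, `score`, and `document` are
// assumed names for the query call and its scored results.
val hits = hybridRetriever.retrieve("How do I configure an agent's memory?")
hits.forEach { hit ->
    println("score=${hit.score} source=${hit.document.metadata["source"]}")
}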

Reranking

Rerankers

Rerankers improve retrieval quality by re-scoring an initial candidate set with a stronger (but slower) relevance model:

// Cross-encoder reranker
val crossEncoderReranker = Reranker.crossEncoder(
    model = "cross-encoder/ms-marco-MiniLM-L-6-v2",
    topK = 3
)

// LLM reranker (deepSeekLlm is a DeepSeek client configured elsewhere)
val llmReranker = Reranker.llm(
    llm = deepSeekLlm,
    prompt = "Rank these documents based on relevance to the query: {{query}}\n\nDocuments:\n{{documents}}",
    topK = 3
)
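A reranker usually wraps a base retriever rather than running on its own: the base retriever fetches a broad candidate set (topK = 10, say) and the reranker re-scores it down to the best few. The complete example below composes the two with Retriever.withReranker:

// Fetch broadly with the hybrid retriever, then let the cross-encoder
// keep only the 3 most relevant chunks.
val rerankedRetriever = Retriever.withReranker(
    baseRetriever = hybridRetriever,
    reranker = crossEncoderReranker
)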

Complete RAG Example

Here’s a complete example of a sophisticated RAG system:

import ai.kastrax.rag.*
import ai.kastrax.core.agent.agent
import ai.kastrax.integrations.deepseek.deepSeek
import ai.kastrax.integrations.deepseek.DeepSeekModel
import kotlinx.coroutines.runBlocking

fun createAdvancedRagSystem() = runBlocking {
    // 1. Load and process documents
    val loader = DocumentLoader.fromDirectory("knowledge_base/")
    val documents = loader.load()

    val splitter = TextSplitter.recursive(
        chunkSize = 1000,
        chunkOverlap = 200
    )
    val chunks = splitter.split(documents)

    // 2. Set up embedding and vector store
    val embeddingModel = EmbeddingModel.create("all-MiniLM-L6-v2")
    val vectorStore = VectorStore.chroma("kastrax_docs")

    // 3. Add documents to vector store
    vectorStore.addDocuments(
        documents = chunks,
        embeddingModel = embeddingModel
    )

    // 4. Create a hybrid retriever
    val vectorRetriever = Retriever.fromVectorStore(
        vectorStore = vectorStore,
        embeddingModel = embeddingModel,
        topK = 10
    )
    val keywordRetriever = Retriever.keyword(
        documents = chunks,
        topK = 10
    )
    val hybridRetriever = Retriever.hybrid(
        vectorRetriever = vectorRetriever,
        keywordRetriever = keywordRetriever,
        vectorWeight = 0.7,
        keywordWeight = 0.3
    )

    // 5. Add reranking
    val reranker = Reranker.crossEncoder(
        model = "cross-encoder/ms-marco-MiniLM-L-6-v2",
        topK = 5
    )
    val enhancedRetriever = Retriever.withReranker(
        baseRetriever = hybridRetriever,
        reranker = reranker
    )

    // 6. Create context builder with compression
    val contextBuilder = ContextBuilder.create {
        template("""
            You are a helpful assistant for Kastrax, a powerful AI Agent framework.
            Answer the question based on the following context from the Kastrax documentation.

            Context:
            {{#each documents}}
            Document {{@index}} (Source: {{metadata.source}}):
            {{pageContent}}

            {{/each}}

            Question: {{query}}

            Answer the question using only the information provided in the context.
            If you don't have enough information, say "I don't have enough information."
            Always cite the source documents when providing information.
        """.trimIndent())

        compression {
            enabled(true)
            method(CompressionMethod.MAP_REDUCE)
            maxTokens(3000)
        }
    }

    // 7. Create RAG agent
    val ragAgent = agent {
        name("KastraxDocumentationAssistant")
        description("An assistant that helps with Kastrax documentation")

        // Configure the LLM
        model = deepSeek {
            apiKey("your-deepseek-api-key")
            model(DeepSeekModel.DEEPSEEK_CHAT)
            temperature(0.3) // Lower temperature for more factual responses
        }

        // Configure RAG
        rag {
            retriever(enhancedRetriever)
            contextBuilder(contextBuilder)
            maxTokens(3000)
        }
    }

    // 8. Use the RAG agent
    val query = "How do I implement a custom tool in Kastrax?"
    val response = ragAgent.generate(query)
    println(response.text)
}

Next Steps

Now that you understand the RAG system, you can:

  1. Learn about document processing
  2. Explore vector stores
  3. Implement advanced retrieval techniques
  4. Set up RAG evaluation