
Legit-RAG
A simple, Dockerized, open-source RAG pipeline: a modular Retrieval-Augmented Generation system built with FastAPI, Qdrant, and OpenAI.
System Components
- Components - Individual RAG components
- Workflow Components - RAG workflow implementation
- Logging System - Event logging and visualization
Workflow Components
The system follows a 5-step RAG workflow:
1. Query Routing (router.py) - Determines if a query can be answered (ANSWER), needs clarification (CLARIFY), or should be rejected (REJECT)
- Uses an LLM to make intelligent routing decisions
- Extensible through the BaseRequestRouter interface
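For illustration, a custom router might plug in like this (BaseRequestRouter is the interface named above, but the method signature and enum shown here are assumptions, not the repo's exact API):

from abc import ABC, abstractmethod
from enum import Enum

class RouteDecision(Enum):
    ANSWER = "answer"
    CLARIFY = "clarify"
    REJECT = "reject"

class BaseRequestRouter(ABC):
    @abstractmethod
    def route(self, query: str) -> RouteDecision:
        """Decide how the pipeline should handle this query."""

class KeywordBlocklistRouter(BaseRequestRouter):
    """Hypothetical non-LLM router: reject queries mentioning blocked terms."""
    def __init__(self, blocked: set[str]):
        self.blocked = {term.lower() for term in blocked}

    def route(self, query: str) -> RouteDecision:
        if set(query.lower().split()) & self.blocked:
            return RouteDecision.REJECT
        return RouteDecision.ANSWER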
2. Query Reformulation (reformulator.py) - Refines the original query for better retrieval
- Extracts keywords for hybrid search
- Implements BaseQueryReformulator for custom reformulation strategies
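A minimal custom reformulator could look like this (BaseQueryReformulator is the interface named above; the method signature and return container are assumptions):

from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class ReformulatedQuery:
    refined_text: str     # rewritten query for semantic search
    keywords: list[str]   # terms for keyword search

class BaseQueryReformulator(ABC):
    @abstractmethod
    def reformulate(self, query: str) -> ReformulatedQuery: ...

class StopwordReformulator(BaseQueryReformulator):
    """Hypothetical non-LLM strategy: strip stopwords to get keywords."""
    STOPWORDS = {"the", "a", "an", "is", "what", "does", "of"}

    def reformulate(self, query: str) -> ReformulatedQuery:
        keywords = [w for w in query.lower().split() if w not in self.STOPWORDS]
        return ReformulatedQuery(refined_text=query, keywords=keywords)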
3. Context Retrieval (retriever.py) - Performs hybrid search combining:
- Semantic search using embeddings
- Keyword-based search
- Currently uses Qdrant for vector storage
- Extensible through the BaseRetriever interface
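The merge step can be as simple as a weighted score combination; a sketch, assuming both searches return (doc_id, score) pairs with scores normalized to [0, 1] (function and parameter names are illustrative, not the repo's actual API):

def hybrid_search(semantic_hits, keyword_hits, alpha=0.7):
    """Merge two ranked result lists by weighted score.

    semantic_hits / keyword_hits: lists of (doc_id, score) pairs.
    alpha controls how much the semantic score dominates.
    """
    merged = {}
    for doc_id, score in semantic_hits:
        merged[doc_id] = merged.get(doc_id, 0.0) + alpha * score
    for doc_id, score in keyword_hits:
        merged[doc_id] = merged.get(doc_id, 0.0) + (1 - alpha) * score
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)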
4. Completion Check (completion_checker.py) - Evaluates whether the retrieved context is sufficient to answer the query
- Returns a confidence score
- Customizable threshold through configuration
- Implements the BaseCompletionChecker interface
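Conceptually, the check reduces to comparing a confidence score against the configured threshold (class and method names below are assumptions, not the repo's exact API):

from abc import ABC, abstractmethod

class BaseCompletionChecker(ABC):
    @abstractmethod
    def score(self, query: str, context: list[str]) -> float:
        """Return a confidence in [0, 1] that the context can answer the query."""

    def is_sufficient(self, query: str, context: list[str], threshold: float = 0.7) -> bool:
        # In practice the threshold would come from configuration.
        return self.score(query, context) >= threshold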
5. Answer Generation (answer_generator.py) - Generates the final response using the retrieved context
- Includes relevant citations
- Provides confidence scoring
- Extensible through the BaseAnswerGenerator interface
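A common way to make citations traceable is to number the context chunks in the prompt and ask the model to cite them inline; a sketch of that idea (the prompt wording and helper name are assumptions):

def build_prompt(query: str, chunks: list[str]) -> str:
    """Number each chunk so the model can cite sources as [1], [2], ..."""
    numbered = "\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer the question using only the sources below. "
        "Cite sources inline as [n].\n\n"
        f"Sources:\n{numbered}\n\nQuestion: {query}\nAnswer:"
    )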
Extensibility
The system is designed for easy extension and modification:
LLM Providers
- Currently uses OpenAI
- Can be extended to support other providers (Anthropic, Bedrock, etc.)
- Each component uses abstract base classes for provider independence
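The provider-independence layer can be as small as one abstract class; a sketch using the current openai Python client (the class names and model choice here are assumptions, not the repo's code):

from abc import ABC, abstractmethod

class BaseLLMClient(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class OpenAIClient(BaseLLMClient):
    def __init__(self, model: str = "gpt-4o-mini"):
        from openai import OpenAI  # reads OPENAI_API_KEY from the environment
        self.client = OpenAI()
        self.model = model

    def complete(self, prompt: str) -> str:
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

Adding Anthropic, Bedrock, or another provider then means writing one more subclass rather than touching the pipeline components.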
Vector Databases
- Currently implements Qdrant
- Can be extended to support other vector DBs (Pinecone, Weaviate, etc.)
- Abstract BaseRetriever interface for new implementations
Document Management
- Flexible document model with metadata support
- Extensible for different document types and sources
Search Strategies
- Hybrid search combining semantic and keyword approaches
- Customizable result merging strategies
- Extensible for additional search methods
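As one example of a swappable merging strategy, reciprocal rank fusion (RRF) combines rankings without requiring comparable scores; this is a generic technique, not necessarily what the repo ships:

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked doc-id lists; k=60 is the conventional RRF constant."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)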
Setup and Installation
Prerequisites
- Python 3.10+
- Docker and Docker Compose
- OpenAI API key
Setup Steps
- Clone the repository:
git clone https://github.com/yourusername/legit-rag.git
cd legit-rag
- Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Create a .env file:
cp .env.example .env
- Edit .env and add your OpenAI API key:
OPENAI_API_KEY=your-key-here
Running the System
Start the API server and the Qdrant vector database:
docker-compose up -d
- The API will be available at http://localhost:8000
- The Qdrant db will be available at http://localhost:6333
To run the API server directly (e.g. under a debugger), first stop it in Docker, then run:
python -m src.api
API Endpoints
Add Documents
POST /documents
{
    "documents": [
        {
            "text": "Your document text here",
            "metadata": {"source": "wiki", "topic": "example"}
        }
    ]
}
Query
POST /query
{
    "query": "Your question here"
}
Example Usage
import requests

# Add documents
docs = {
    "documents": [
        {
            "text": "Example document text",
            "metadata": {"source": "example"}
        }
    ]
}
response = requests.post("http://localhost:8000/documents", json=docs)

# Query
query = {
    "query": "What does the document say?"
}
response = requests.post("http://localhost:8000/query", json=query)
print(response.json())
API Documentation
Once the server is running, you can access the API documentation at:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
Configuration
Key configuration options in config.py:
- LLM models for each component
- Vector DB settings
- Completion threshold
- API endpoints and ports
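Purely as an illustration (these field names are assumptions, not the actual contents of config.py), those options might be grouped along these lines:

from dataclasses import dataclass

@dataclass
class RAGConfig:
    # Hypothetical grouping of the options listed above.
    router_model: str = "gpt-4o-mini"          # LLM per component
    answer_model: str = "gpt-4o"
    qdrant_url: str = "http://localhost:6333"  # vector DB settings
    collection_name: str = "documents"
    completion_threshold: float = 0.7          # context-sufficiency cutoff
    api_host: str = "0.0.0.0"
    api_port: int = 8000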
Future Enhancements
- Provider-agnostic LLM interface
- Support for streaming responses
- Additional vector database implementations
- Enhanced document preprocessing
- Caching layer for frequent queries
- Batch document processing
- Advanced result ranking strategies