Create and manage your knowledge base with Simba and connect it to any RAG app

Published on Feb 5, 2025

Simba - Your Knowledge Management System

Connect your knowledge to any RAG system

Simba is an open source, portable KMS (knowledge management system) designed to integrate seamlessly with any Retrieval-Augmented Generation (RAG) system. With a modern UI and modular architecture, Simba allows developers to focus on building advanced AI solutions without worrying about the complexities of knowledge management.

🚀 Features

🧩 Modular Architecture: Plug in various vector stores, embedding models, chunkers, and parsers.
🖥️ Modern UI: Intuitive user interface to visualize and modify every document chunk.
🔗 Seamless Integration: Easily integrates with any RAG-based system.
👨‍💻 Developer Focus: Simplifies knowledge management so you can concentrate on building core AI functionality.
📦 Open Source & Extensible: Community-driven, with room for custom features and integrations.

🎥 Demo

Watch the demo

🛠️ Getting Started

📋 Prerequisites

Before you begin, ensure you have met the following requirements:

Python 3.11+ and Poetry
Redis 7.0+
Node.js 20+
Git for version control.
(Optional) Docker for containerized deployment.

📦 Installation

Install simba-core:

pip install simba-core

Clone the repository and install dependencies:

git clone https://github.com/GitHamza0206/simba.git
cd simba
poetry config virtualenvs.in-project true
poetry install
source .venv/bin/activate

🔑 Configuration

Create a .env file in the root directory:

OPENAI_API_KEY=your_openai_api_key
REDIS_HOST=localhost
CELERY_BROKER_URL=redis://localhost:6379/0
CELERY_RESULT_BACKEND=redis://localhost:6379/1
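The config.yaml shown later in this section refers to two of these variables through shell-style `${VAR:-default}` placeholders. How Simba resolves them internally is not documented here, so the following is only a minimal sketch of the usual shell semantics: the environment value wins, and the default applies when the variable is unset or empty. `expand_placeholders` is a hypothetical helper name, not part of Simba.

```python
import os
import re

# Matches ${NAME} or ${NAME:-default}, the shell-style form used in config.yaml.
_PLACEHOLDER = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)


def expand_placeholders(value: str, env=None) -> str:
    """Replace ${NAME:-default} with env[NAME], or the default if NAME is unset or empty.

    Hypothetical helper for illustration; Simba's own loader may behave differently.
    """
    env = os.environ if env is None else env

    def substitute(match: re.Match) -> str:
        current = env.get(match.group("name"))
        if current:  # shell ":-" semantics: unset and empty both fall back
            return current
        return match.group("default") or ""

    return _PLACEHOLDER.sub(substitute, value)


# With the .env value present, it overrides the default written in config.yaml.
dotenv_values = {"CELERY_BROKER_URL": "redis://localhost:6379/0"}
broker = expand_placeholders(
    "${CELERY_BROKER_URL:-redis://redis:6379/0}", dotenv_values
)
# With nothing set, the default (the "redis" hostname) is used.
backend = expand_placeholders("${CELERY_RESULT_BACKEND:-redis://redis:6379/1}", {})

print(broker)   # redis://localhost:6379/0
print(backend)  # redis://redis:6379/1
```

Under this reading, leaving the variables out of .env makes Celery look for a host named `redis`, which presumably suits a container setup rather than a local Redis on `localhost`.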

Create or update the config.yaml file in the root directory:

# config.yaml

project:
  name: "Simba"
  version: "1.0.0"
  api_version: "/api/v1"

paths:
  base_dir: null  # Will be set programmatically
  faiss_index_dir: "vector_stores/faiss_index"
  vector_store_dir: "vector_stores"

llm:
  provider: "openai"
  model_name: "gpt-4o-mini"
  temperature: 0.0
  max_tokens: null
  streaming: true
  additional_params: {}

embedding:
  provider: "huggingface"
  model_name: "BAAI/bge-base-en-v1.5"
  device: "mps"  # Apple Silicon (MPS); use "cpu" inside Docker containers, where MPS is not supported
  additional_params: {}

vector_store:
  provider: "faiss"
  collection_name: "simba_collection"
  additional_params: {}

chunking:
  chunk_size: 512
  chunk_overlap: 200

retrieval:
  k: 5

celery: 
  broker_url: ${CELERY_BROKER_URL:-redis://redis:6379/0}
  result_backend: ${CELERY_RESULT_BACKEND:-redis://redis:6379/1}
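To give a feel for what the chunking values above imply, here is a rough sketch of fixed-size windows with overlap. It assumes a plain sliding window measured in abstract units; whether Simba counts characters or tokens, and whether its chunkers also split on separators, is not stated here. `chunk_spans` is a hypothetical name used only for illustration.

```python
def chunk_spans(text_length: int, chunk_size: int = 512, chunk_overlap: int = 200):
    """Return (start, end) offsets of overlapping fixed-size windows over a text.

    Illustrative only: a simple sliding window, not Simba's actual chunker.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    stride = chunk_size - chunk_overlap  # 512 - 200 = 312 new units per chunk
    spans = []
    start = 0
    while start < text_length:
        end = min(start + chunk_size, text_length)
        spans.append((start, end))
        if end == text_length:
            break  # the last window already reaches the end of the text
        start += stride
    return spans


spans = chunk_spans(1000)
print(spans)  # [(0, 512), (312, 824), (624, 1000)]
```

With these defaults each chunk repeats the last 200 units of the previous one, so a 1000-unit document yields three chunks rather than two. A larger overlap preserves more context across chunk boundaries at the cost of storing and embedding more redundant text.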

🚀 Run Simba

Run the server:

simba server

Run the frontend:

simba front

Run the parsers:

simba parsers

🐳 Docker Deployment

Run on Specific Hardware

For CPU:

DEVICE=cpu make build
DEVICE=cpu make up

For NVIDIA GPU with Ollama:

DEVICE=cuda make build
DEVICE=cuda make up

For Apple Silicon:

# Note: MPS (Metal Performance Shaders) is NOT supported in Docker containers
# For Docker, always use CPU mode even on Apple Silicon:
DEVICE=cpu make build
DEVICE=cpu make up

Run with Ollama service (for CPU):

DEVICE=cpu ENABLE_OLLAMA=true make up
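If you drive these commands from a script, the device choice above can be captured in a small helper. This is only a sketch: `make_invocation` is a hypothetical function, and it assumes the Makefile reads nothing beyond the `DEVICE` and `ENABLE_OLLAMA` variables shown in this section.

```python
import os


def make_invocation(target: str, has_nvidia_gpu: bool = False, enable_ollama: bool = False):
    """Return (argv, env) for running `make <target>` with the DEVICE settings listed above.

    Hypothetical helper; it only mirrors the variables shown in this README.
    """
    if target not in ("build", "up"):
        raise ValueError("target must be 'build' or 'up'")
    env = dict(os.environ)
    # MPS is not available inside Docker, so Apple Silicon falls back to "cpu".
    env["DEVICE"] = "cuda" if has_nvidia_gpu else "cpu"
    if enable_ollama:
        env["ENABLE_OLLAMA"] = "true"
    return ["make", target], env


argv, env = make_invocation("up", enable_ollama=True)
print(argv, env["DEVICE"], env["ENABLE_OLLAMA"])
# To actually run it from the repository root:
#   subprocess.run(argv, env=env, check=True)
```

The helper never returns `mps`, which matches the note above that Docker should always use CPU mode on Apple Silicon.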

Run in background mode:

# All commands run in detached mode by default

For detailed Docker instructions, see the Docker deployment guide.

🏁 Roadmap

💻 pip install simba-core
🔧 pip install simba-sdk
🌐 www.simba-docs.com
🔒 Adding Auth & access management
🕸️ Adding web scraping
☁️ Pulling data from Azure / AWS / GCP
📚 More parsers and chunkers available
🎨 Better UX/UI

🤝 Contributing

Contributions are welcome! If you'd like to contribute to Simba, please follow these steps:

Fork the repository.
Create a new branch for your feature or bug fix.
Commit your changes with clear messages.
Open a pull request describing your changes.

💬 Support & Contact

For support or inquiries, please open an issue 📌 on GitHub or contact the repo owner, Hamza Zerouali.
