# 🐉 DnD Campaign Oracle: Local RAG Assistant
An advanced Retrieval-Augmented Generation (RAG) system designed for Dungeon Masters. This tool ingests markdown-based campaign notes, enriches them with AI-generated metadata, and provides an interactive terminal interface to query your world’s lore using DSPy and Local LLMs.
## ⚔️ Key Features
- **Parallel Enrichment:** Uses a `ThreadPoolExecutor` to process multiple document chunks simultaneously across local LLM slots for high-speed ingestion (see the sketch after this list).
- **Structured Metadata:** Uses DSPy `TypedPredictor`s and Pydantic to force LLMs to output valid JSON synopses, tags, and entity lists.
- **Deep Context Retrieval:** Unlike standard RAG, this system retrieves relevant chunks and then "peeks" at the full source file to provide the LLM with broader narrative context.
- **Local-First:** Designed to run entirely on your hardware using LM Studio and FAISS, keeping your campaign secrets private.
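A minimal sketch of how the first two features combine, assuming a configured DSPy LM (see Setup below). The real schema and agent live in `experts/ingestion_agent.py`, so the names `ChunkMetadata`, `EnrichChunk`, and `enrich` here are illustrative, not this repo's exact API:

```python
from concurrent.futures import ThreadPoolExecutor

import dspy
from pydantic import BaseModel

class ChunkMetadata(BaseModel):
    """Structured output the LLM is forced to emit for each chunk."""
    synopsis: str        # one-sentence summary
    tags: list[str]      # plot points, item names, themes
    entities: list[str]  # NPCs, locations, factions

class EnrichChunk(dspy.Signature):
    """Extract structured campaign metadata from a chunk of notes."""
    chunk: str = dspy.InputField()
    metadata: ChunkMetadata = dspy.OutputField()

enricher = dspy.TypedPredictor(EnrichChunk)

def enrich(chunk: str) -> ChunkMetadata:
    return enricher(chunk=chunk).metadata

chunks = ["The party met the baker's assistant at the Golden Grain Inn."]

# Fan chunks out across local LLM slots; max_workers should match how many
# concurrent requests your LM Studio server can actually serve.
with ThreadPoolExecutor(max_workers=8) as pool:
    enriched = list(pool.map(enrich, chunks))
```

Because the output field is a Pydantic model, DSPy can retry malformed generations until the JSON validates, which is what keeps the metadata machine-readable.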
## 🏗️ Architecture
1. **Ingestion:** Scans `DATA_DIR` for `.md` files.
2. **Chunking:** Splits documents into 800-character segments with overlap.
3. **Enrichment:** A DSPy `IngestionAgent` analyzes each chunk to extract:
   - **Synopsis:** A one-sentence summary.
   - **Tags:** Plot points, item names, or themes.
   - **Entities:** Specific NPCs, Locations, or Factions.
4. **Vector Store:** Chunks and metadata are embedded using `text-embedding-qwen3` and stored in a local FAISS index (see the sketch after this list).
5. **Interactive RAG:** A terminal loop that uses Chain-of-Thought (CoT) reasoning to answer queries based on retrieved context.
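The chunking and vector-store steps might look roughly like this. This sketch talks to LM Studio through its OpenAI-compatible `/v1/embeddings` endpoint rather than the repo's actual `LocalLMEmbeddings` wrapper, and the overlap size and file path are assumptions for illustration:

```python
import faiss
import numpy as np
from openai import OpenAI

CHUNK_SIZE, OVERLAP = 800, 100  # 800-char segments; overlap value assumed

def chunk_text(text: str) -> list[str]:
    """Slide a fixed-size window over the document, overlapping neighbors."""
    step = CHUNK_SIZE - OVERLAP
    return [text[i:i + CHUNK_SIZE] for i in range(0, len(text), step)]

# LM Studio serves embeddings on an OpenAI-compatible API; the key is ignored.
client = OpenAI(base_url="http://192.168.0.49:1234/v1", api_key="lm-studio")

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(
        model="text-embedding-qwen3-embedding-8b", input=texts)
    return np.array([d.embedding for d in resp.data], dtype="float32")

chunks = chunk_text(open("notes/Session_12.md").read())  # path is an example
vectors = embed(chunks)

index = faiss.IndexFlatL2(vectors.shape[1])  # exact L2 search over all chunks
index.add(vectors)
faiss.write_index(index, "local_faiss_db/index.faiss")
```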
## 🛠️ Setup
### Prerequisites
- Python 3.10+
- **LM Studio:** Running a local server at `http://192.168.0.49:1234` (or your specific IP).
- **Models:**
  - Inference: `qwen3-8b` (or similar).
  - Embedding: `text-embedding-qwen3-embedding-8b`.
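If you are wiring DSPy to LM Studio yourself, the configuration typically looks like the snippet below; the model identifier must match what LM Studio reports, and the API key is a placeholder since LM Studio accepts any value:

```python
import dspy

# The "openai/" prefix tells DSPy (via LiteLLM) to speak the
# OpenAI-compatible protocol that LM Studio's local server exposes.
lm = dspy.LM(
    "openai/qwen3-8b",
    api_base="http://192.168.0.49:1234/v1",
    api_key="lm-studio",  # placeholder; LM Studio ignores it
)
dspy.configure(lm=lm)
```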
### Installation
```bash
uv sync
```
## 🚀 Usage
### 1. Ingest & Enrich
Run the ingestion script to process your markdown files and build the vector database.
```bash
uv run src/ingest.py
```
### 2. Query the Oracle
Launch the interactive session to ask questions about your campaign.
```bash
uv run src/retrieve.py
```
Example Query:
```text
📝 Query: Why did the party get free bread at the Golden Grain Inn?

📜 AI RESPONSE: Based on the session notes from 'Session_12.md', the party received free bread because the Rogue successfully intimidated the baker's assistant, and the Cleric later performed a minor miracle (Thaumaturgy) that impressed the owner.
```
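Under the hood, the "deep context" step from the feature list might be implemented along these lines. The sketch reuses `embed` and `index` from the ingestion sketch above and assumes a `metadata` list mapping each vector id to its source file; none of these names is guaranteed to match `retrieve.py` exactly:

```python
import dspy

class CampaignQA(dspy.Signature):
    """Answer a DM's question from retrieved campaign context."""
    context: str = dspy.InputField()
    question: str = dspy.InputField()
    answer: str = dspy.OutputField()

oracle = dspy.ChainOfThought(CampaignQA)  # CoT reasoning over the context

def answer(question: str, k: int = 4) -> str:
    query_vec = embed([question])
    _, ids = index.search(query_vec, k)    # top-k nearest chunks
    hits = [metadata[i] for i in ids[0]]   # e.g. {"source": "Session_12.md"}
    # Deep context: feed the LLM each hit's *entire* source file,
    # not just the 800-character chunk that matched.
    files = {hit["source"] for hit in hits}
    context = "\n\n".join(open(path).read() for path in files)
    return oracle(context=context, question=question).answer

while True:
    query = input("📝 Query: ")
    print("📜 AI RESPONSE:", answer(query))
```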
## 📂 File Structure
- `ingest.py`: Handles file loading, multi-threaded enrichment, and FAISS storage.
- `retrieve.py`: The interactive terminal-based retrieval loop.
- `experts/ingestion_agent.py`: Contains the `IngestionAgent` and Pydantic schemas.
- `embedding.py`: Custom wrapper for `LocalLMEmbeddings` with batch-processing support.
- `local_faiss_db/`: Directory where the vector index and metadata are persisted.
## ⚙️ Configuration
In `ingest.py`, you can tune the processing speed:
- `max_workers=8`: Adjust based on your GPU/CPU capability to handle concurrent LLM requests.
- `chunk_size=800`: Increase for more context per chunk, decrease for more granular searching.
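Sketched in place (reusing `enrich` and `chunks` from the enrichment sketch above, with names assumed rather than copied from `ingest.py`), the two knobs sit roughly here:

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 800  # larger = more context per chunk, smaller = finer search
MAX_WORKERS = 8   # raise only if your GPU/CPU and LM Studio slots keep up

with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    enriched = list(pool.map(enrich, chunks))
```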