# 🐉 DnD Campaign Oracle: Local RAG Assistant
An advanced Retrieval-Augmented Generation (RAG) system designed for Dungeon Masters. This tool ingests markdown-based campaign notes, enriches them with AI-generated metadata, and provides an interactive terminal interface to query your world’s lore using DSPy and Local LLMs.
## ⚔️ Key Features
- **Parallel Enrichment:** Uses a `ThreadPoolExecutor` to process multiple document chunks simultaneously across local LLM slots for high-speed ingestion (see the sketch after this list).
- **Structured Metadata:** Uses DSPy `TypedPredictor`s and Pydantic to force LLMs to output valid JSON synopses, tags, and entity lists.
- **Deep Context Retrieval:** Unlike standard RAG, this system retrieves relevant chunks and then "peeks" at the full source file to provide the LLM with broader narrative context.
- **Local-First:** Designed to run entirely on your hardware using LM Studio and FAISS, keeping your campaign secrets private.
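A minimal sketch of how the first two features combine, assuming a configured DSPy LM (see Setup below). The real schema and agent live in `experts/ingestion_agent.py`, so the names `ChunkMetadata`, `EnrichChunk`, and `enrich` here are illustrative, not this repo's exact API:

```python
from concurrent.futures import ThreadPoolExecutor

import dspy
from pydantic import BaseModel

class ChunkMetadata(BaseModel):
    """Structured output the LLM is forced to emit for each chunk."""
    synopsis: str        # one-sentence summary
    tags: list[str]      # plot points, item names, themes
    entities: list[str]  # NPCs, locations, factions

class EnrichChunk(dspy.Signature):
    """Extract structured campaign metadata from a chunk of notes."""
    chunk: str = dspy.InputField()
    metadata: ChunkMetadata = dspy.OutputField()

enricher = dspy.TypedPredictor(EnrichChunk)

def enrich(chunk: str) -> ChunkMetadata:
    return enricher(chunk=chunk).metadata

chunks = ["The party met the baker's assistant at the Golden Grain Inn."]

# Fan chunks out across local LLM slots; max_workers should match how many
# concurrent requests your LM Studio server can actually serve.
with ThreadPoolExecutor(max_workers=8) as pool:
    enriched = list(pool.map(enrich, chunks))
```

Because the output field is a Pydantic model, DSPy can retry malformed generations until the JSON validates, which is what keeps the metadata machine-readable.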
## 🏗️ Architecture
1. **Ingestion:** Scans `DATA_DIR` for `.md` files.
2. **Chunking:** Splits documents into 800-character segments with overlap.
3. **Enrichment:** A DSPy `IngestionAgent` analyzes each chunk to extract:
   - **Synopsis:** A one-sentence summary.
   - **Tags:** Plot points, item names, or themes.
   - **Entities:** Specific NPCs, Locations, or Factions.
4. **Vector Store:** Chunks and metadata are embedded using `text-embedding-qwen3` and stored in a local FAISS index (see the sketch after this list).
5. **Interactive RAG:** A terminal loop that uses Chain-of-Thought (CoT) reasoning to answer queries based on retrieved context.
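The chunking and vector-store steps might look roughly like this. This sketch talks to LM Studio through its OpenAI-compatible `/v1/embeddings` endpoint rather than the repo's actual `LocalLMEmbeddings` wrapper, and the overlap size and file path are assumptions for illustration:

```python
import faiss
import numpy as np
from openai import OpenAI

CHUNK_SIZE, OVERLAP = 800, 100  # 800-char segments; overlap value assumed

def chunk_text(text: str) -> list[str]:
    """Slide a fixed-size window over the document, overlapping neighbors."""
    step = CHUNK_SIZE - OVERLAP
    return [text[i:i + CHUNK_SIZE] for i in range(0, len(text), step)]

# LM Studio serves embeddings on an OpenAI-compatible API; the key is ignored.
client = OpenAI(base_url="http://192.168.0.49:1234/v1", api_key="lm-studio")

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(
        model="text-embedding-qwen3-embedding-8b", input=texts)
    return np.array([d.embedding for d in resp.data], dtype="float32")

chunks = chunk_text(open("notes/Session_12.md").read())  # path is an example
vectors = embed(chunks)

index = faiss.IndexFlatL2(vectors.shape[1])  # exact L2 search over all chunks
index.add(vectors)
faiss.write_index(index, "local_faiss_db/index.faiss")
```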
## 🛠️ Setup
### Prerequisites
- Python 3.10+
- **LM Studio:** Running a local server at `http://192.168.0.49:1234` (or your specific IP).
- **Models:**
  - Inference: `qwen3-8b` (or similar).
  - Embedding: `text-embedding-qwen3-embedding-8b`.
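If you are wiring DSPy to LM Studio yourself, the configuration typically looks like the snippet below; the model identifier must match what LM Studio reports, and the API key is a placeholder since LM Studio accepts any value:

```python
import dspy

# The "openai/" prefix tells DSPy (via LiteLLM) to speak the
# OpenAI-compatible protocol that LM Studio's local server exposes.
lm = dspy.LM(
    "openai/qwen3-8b",
    api_base="http://192.168.0.49:1234/v1",
    api_key="lm-studio",  # placeholder; LM Studio ignores it
)
dspy.configure(lm=lm)
```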
### Installation
```bash
uv sync
```
## 🚀 Usage
### 1. Ingest & Enrich
Run the ingestion script to process your markdown files and build the vector database.
```bash
uv run src/ingest.py
```
### 2. Query the Oracle
Launch the interactive session to ask questions about your campaign.
```bash
uv run src/retrieve.py
```
Example Query:
```text
📝 Query: Why did the party get free bread at the Golden Grain Inn?

📜 AI RESPONSE: Based on the session notes from 'Session_12.md', the party received free bread because the Rogue successfully intimidated the baker's assistant, and the Cleric later performed a minor miracle (Thaumaturgy) that impressed the owner.
```
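Under the hood, the "deep context" step from the feature list might be implemented along these lines. The sketch reuses `embed` and `index` from the ingestion sketch above and assumes a `metadata` list mapping each vector id to its source file; none of these names is guaranteed to match `retrieve.py` exactly:

```python
import dspy

class CampaignQA(dspy.Signature):
    """Answer a DM's question from retrieved campaign context."""
    context: str = dspy.InputField()
    question: str = dspy.InputField()
    answer: str = dspy.OutputField()

oracle = dspy.ChainOfThought(CampaignQA)  # CoT reasoning over the context

def answer(question: str, k: int = 4) -> str:
    query_vec = embed([question])
    _, ids = index.search(query_vec, k)    # top-k nearest chunks
    hits = [metadata[i] for i in ids[0]]   # e.g. {"source": "Session_12.md"}
    # Deep context: feed the LLM each hit's *entire* source file,
    # not just the 800-character chunk that matched.
    files = {hit["source"] for hit in hits}
    context = "\n\n".join(open(path).read() for path in files)
    return oracle(context=context, question=question).answer

while True:
    query = input("📝 Query: ")
    print("📜 AI RESPONSE:", answer(query))
```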
## 📂 File Structure
- `ingest.py`: Handles file loading, multi-threaded enrichment, and FAISS storage.
- `retrieve.py`: The interactive terminal-based retrieval loop.
- `experts/ingestion_agent.py`: Contains the `IngestionAgent` and Pydantic schemas.
- `embedding.py`: Custom wrapper for `LocalLMEmbeddings` with batch-processing support.
- `local_faiss_db/`: Directory where the vector index and metadata are persisted.
## ⚙️ Configuration
In `ingest.py`, you can tune the processing speed:
- `max_workers=8`: Adjust based on your GPU/CPU capability to handle concurrent LLM requests.
- `chunk_size=800`: Increase for more context per chunk, decrease for more granular searching.
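Sketched in place (reusing `enrich` and `chunks` from the enrichment sketch above, with names assumed rather than copied from `ingest.py`), the two knobs sit roughly here:

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 800  # larger = more context per chunk, smaller = finer search
MAX_WORKERS = 8   # raise only if your GPU/CPU and LM Studio slots keep up

with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    enriched = list(pool.map(enrich, chunks))
```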