# 🐉 DnD Campaign Oracle: Local RAG Assistant
An advanced Retrieval-Augmented Generation (RAG) system designed for Dungeon Masters. This tool ingests markdown-based campaign notes, enriches them with AI-generated metadata, and provides an interactive terminal interface to query your world's lore using **DSPy** and **Local LLMs**.
## ⚔️ Key Features
* **Parallel Enrichment:** Utilizes a `ThreadPoolExecutor` to process multiple document chunks simultaneously across local LLM slots for high-speed ingestion.
* **Structured Metadata:** Uses **DSPy TypedPredictors** and **Pydantic** to force LLMs to output valid JSON synopses, tags, and entity lists.
* **Deep Context Retrieval:** Unlike standard RAG, this system retrieves relevant chunks and then "peeks" at the full source file to provide the LLM with broader narrative context.
* **Local-First:** Designed to run entirely on your hardware using **LM Studio** and **FAISS**, keeping your campaign secrets private.
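
The "peek" behaviour described above can be sketched in plain Python. The dict-based file store and field names here are illustrative stand-ins, not the project's actual structures (the real pipeline stores chunk metadata in FAISS and reads source files from `DATA_DIR`):

```python
# Illustrative in-memory stand-in for campaign files on disk.
FILES = {
    "Session_12.md": "The party arrives at the Golden Grain Inn. "
                     "The Rogue intimidates the baker's assistant.",
}

# Each retrieved chunk carries a pointer back to its source file.
retrieved = [{"text": "The Rogue intimidates", "source": "Session_12.md"}]

def peek_full_context(hits):
    """Attach the full source document to each retrieved chunk."""
    return [{**hit, "full_context": FILES[hit["source"]]} for hit in hits]

expanded = peek_full_context(retrieved)
print(expanded[0]["full_context"][:20])  # → "The party arrives at"
```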
---
## 🏗️ Architecture
1. **Ingestion:** Scans `DATA_DIR` for `.md` files.
2. **Chunking:** Splits documents into 800-character segments with overlap.
3. **Enrichment:** A DSPy `IngestionAgent` analyzes each chunk to extract:
   * **Synopsis:** A one-sentence summary.
   * **Tags:** Plot points, item names, or themes.
   * **Entities:** Specific NPCs, Locations, or Factions.
4. **Vector Store:** Chunks and metadata are embedded using `text-embedding-qwen3` and stored in a local **FAISS** index.
5. **Interactive RAG:** A terminal loop that uses **Chain of Thought (CoT)** reasoning to answer queries based on retrieved context.
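
Step 2 can be sketched as a simple sliding window. The overlap value here is an assumption for illustration; the project only specifies 800-character chunks:

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size segments, each overlapping the previous one."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# A 2000-character document yields three chunks: 800, 800, and a 600-char tail.
print([len(c) for c in chunk_text("a" * 2000)])  # → [800, 800, 600]
```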
---
## 🛠️ Setup
### Prerequisites
* **Python 3.10+**
* **LM Studio:** Running a local server at `http://192.168.0.49:1234` (or your specific IP).
* **Models:**
  * Inference: `qwen3-8b` (or similar).
  * Embedding: `text-embedding-qwen3-embedding-8b`.
### Installation
```bash
uv sync
```
---
## 🚀 Usage
### 1. Ingest & Enrich
Run the ingestion script to process your markdown files and build the vector database.
```bash
uv run src/ingest.py
```
### 2. Query the Oracle
Launch the interactive session to ask questions about your campaign.
```bash
uv run src/retrieve.py
```
**Example Query:**
> `📝 Query: Why did the party get free bread at the Golden Grain Inn?`
> `📜 AI RESPONSE: Based on the session notes from 'Session_12.md', the party received free bread because the Rogue successfully intimidated the baker's assistant, and the Cleric later performed a minor miracle (Thaumaturgy) that impressed the owner.`
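
Under the hood, answering a query like this starts with nearest-neighbour search over embeddings. Here is a toy version with hand-made 3-dimensional vectors; in the real pipeline the vectors come from the embedding model and the search runs inside FAISS:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "index": chunk text -> hand-made embedding.
index = {
    "free bread at the Golden Grain Inn": [0.9, 0.1, 0.0],
    "a dragon attacks the keep":          [0.0, 0.2, 0.9],
}

query_vec = [0.8, 0.2, 0.1]  # pretend embedding of the bread question
best = max(index, key=lambda text: cosine(query_vec, index[text]))
print(best)  # → free bread at the Golden Grain Inn
```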
---
## 📂 File Structure
* `ingest.py`: Handles file loading, multi-threaded enrichment, and FAISS storage.
* `retrieve.py`: The interactive terminal-based retrieval loop.
* `experts/ingestion_agent.py`: Contains the `IngestionAgent` and Pydantic schemas.
* `embedding.py`: Custom wrapper for `LocalLMEmbeddings` with batch processing support.
* `local_faiss_db/`: Directory where the vector index and metadata are persisted.
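
The Pydantic schemas in `experts/ingestion_agent.py` presumably look something like the following; the field names here are guesses based on the enrichment steps described above, not the project's actual definitions:

```python
from pydantic import BaseModel

class ChunkMetadata(BaseModel):
    """Hypothetical enrichment output: one synopsis, plus tag and entity lists."""
    synopsis: str
    tags: list[str]
    entities: list[str]

meta = ChunkMetadata(
    synopsis="The party earns free bread at the Golden Grain Inn.",
    tags=["bread", "reward"],
    entities=["Golden Grain Inn"],
)
```

A schema like this is what lets DSPy's typed prediction reject malformed LLM output instead of silently storing broken JSON.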
---
## ⚙️ Configuration
In `ingest.py`, you can tune the processing speed:
* `max_workers=8`: Adjust based on your GPU/CPU capability to handle concurrent LLM requests.
* `chunk_size=800`: Increase for more context per chunk, decrease for more granular searching.
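
The `max_workers` knob maps directly onto a `ThreadPoolExecutor`. A minimal sketch with a stubbed-out enrichment call (the real version sends each chunk to a local LLM slot and parses the structured response):

```python
from concurrent.futures import ThreadPoolExecutor

def enrich(chunk: str) -> dict:
    # Stub: stands in for the LLM call that produces synopsis/tags/entities.
    return {"chunk": chunk, "synopsis": chunk[:40]}

chunks = [f"chunk {i}" for i in range(16)]
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(enrich, chunks))  # map preserves input order
print(len(results))  # → 16
```

Threads work well here because each worker spends most of its time waiting on the LLM server, so the GIL is not the bottleneck.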
---