From 5076486935553ae72ae40dd0b4a00cecc653d86b Mon Sep 17 00:00:00 2001
From: Clawdbot
Date: Thu, 12 Feb 2026 13:34:08 +1100
Subject: [PATCH] docs: update README for Postgres+pgvector and add ingestion TODOs

---
 README.md | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index cbca63a..797ee9d 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,11 @@
 # knowledge-mcp
 
-A Model Context Protocol (MCP) server that provides scoped RAG workspaces ("Notebooks") backed by **Qdrant** and **TEI**.
+A Model Context Protocol (MCP) server that provides scoped RAG workspaces ("Notebooks") backed by **Postgres + pgvector** and **TEI**.
 
 ## Overview
 
 This server enables an agent to:
-1. Create named "Notebooks" (Qdrant Collections).
+1. Create named "Notebooks" (Postgres-backed collections).
 2. Ingest documents (PDF, Markdown, Text) into specific notebooks.
 3. Query specific notebooks using vector search (RAG).
 4. Synthesize findings across a notebook.
@@ -15,26 +15,32 @@ Designed to replicate the **NotebookLM** experience: clean, focused, bounded con
 ## Stack
 * **Language:** Python 3.11+
 * **Framework:** `mcp` SDK
-* **Vector DB:** Qdrant
+* **Vector DB:** Postgres + pgvector
 * **Embeddings:** Text Embeddings Inference (TEI) - `BAAI/bge-base-en-v1.5`
 
 ## Tools
 
-### `notebook.create`
-Creates a new isolated workspace (Qdrant Collection).
+### `create_notebook`
+Creates a new isolated workspace (Postgres-backed notebook).
 - `name`: string (e.g., "project-alpha")
 
-### `notebook.add_source`
+### `add_source`
 Ingests a document into the notebook.
 - `notebook`: string
-- `url`: string (URL or local path)
+- `content`: string (raw text or local path)
+- `source_name`: string
+- `format`: `text` or `pdf_path`
 
-### `notebook.query`
+### `query_notebook`
 Performs a semantic search/RAG generation against the notebook.
 - `notebook`: string
 - `query`: string
 
 ## Configuration
 Env vars:
-- `QDRANT_URL`: URL to Qdrant (e.g., `http://qdrant.openshift-gitops.svc:6333`)
+- `DATABASE_URL`: Postgres connection string (e.g., `postgresql://postgres:password@postgres.knowledge-mcp.svc:5432/knowledge`)
 - `TEI_URL`: URL to TEI (e.g., `http://text-embeddings.tei.svc.cluster.local:8080`)
+
+## TODO
+- Add PDF → Markdown/text conversion step to improve extraction quality.
+- Add OCR pipeline for scanned PDFs.
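
Note on the Configuration change: the README now documents `DATABASE_URL` and `TEI_URL` as the two required env vars. Below is a minimal sketch of how a server might read and sanity-check them at startup. The `Settings` class and `load_settings` function are hypothetical names for illustration; the patch does not show the server's actual configuration code, so treat this as one possible approach rather than what the repository does.

```python
import os
from dataclasses import dataclass
from typing import Mapping
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two env vars named in the README."""
    database_url: str
    tei_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read DATABASE_URL and TEI_URL, failing early if one is missing or malformed."""
    # Collect every missing variable so the error names all of them at once.
    missing = [name for name in ("DATABASE_URL", "TEI_URL") if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required env vars: {', '.join(missing)}")

    database_url, tei_url = env["DATABASE_URL"], env["TEI_URL"]
    # Only the URL scheme is checked here; no connection is attempted.
    if urlsplit(database_url).scheme not in ("postgres", "postgresql"):
        raise ValueError("DATABASE_URL must be a postgres:// or postgresql:// URL")
    if urlsplit(tei_url).scheme not in ("http", "https"):
        raise ValueError("TEI_URL must be an http:// or https:// URL")
    return Settings(database_url=database_url, tei_url=tei_url)
```

Passing the environment in as a mapping keeps the check testable without touching the real process environment; calling `load_settings()` with no arguments reads `os.environ`.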