Get Started
Recommended first workflow
Start with one clean workspace and a small but representative corpus. The first decision is not “which clever question should I ask?” It is “should this workspace behave like a persistent wiki or like a retrieval engine?” Once that is clear, the rest of the system becomes much easier to operate and explain.
- Choose the workspace mode first: WikiLLM for durable memory, Retrieval for broad document recall.
- Create one clean workspace for one topic, client, project, or research corpus.
- Import a small but representative set of documents before scaling the corpus.
- Process documents so EigenVertex extracts text, OCR, transcription, and semantic metadata.
- If the workspace is Retrieval, chunk and index the corpus for lexical, vector, and graph search.
- If the workspace is WikiLLM, let the backend compile source, topic, entity, and concept pages.
- Use Chat only after the workspace has become readable: wiki pages for WikiLLM, indexed evidence for Retrieval.
- Run wiki lint and maintain in WikiLLM, or inspect evidence quality in Retrieval, before large-scale usage.
Create two workspaces with the same corpus: one in WikiLLM and one in Retrieval. Ask the same question in both. That reveals far more than importing hundreds of documents into one ambiguous environment.
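The side-by-side comparison can be prepared by building two workspace payloads that differ only in mode. This is a minimal sketch that reuses the fields from the workspace-creation request in the Native API section. The `"retrieval"` mode value, the name suffixes, and the slug suffixes are assumptions, since only `"wiki_llm"` appears in this documentation.

```python
def comparison_workspace_payloads(name: str, slug: str) -> list:
    """Build two workspace-creation payloads that differ only in mode."""
    # "wiki_llm" appears in the Native API example; "retrieval" is an assumed value.
    modes = {"wiki_llm": "wiki", "retrieval": "retrieval"}
    return [
        {
            "name": f"{name} ({mode})",
            "slug": f"{slug}-{suffix}",  # hypothetical naming scheme
            "visibility": "private",
            "workspace_mode": mode,
        }
        for mode, suffix in modes.items()
    ]


payloads = comparison_workspace_payloads("Piano Research", "piano-research")
for payload in payloads:
    print(payload["slug"], payload["workspace_mode"])
```

Each payload would then be sent to `POST /v1/workspaces`, and the same documents imported into both workspaces.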
Workspace Modes
Choose the mode before you ingest
A workspace now has a primary mode. This is not a cosmetic setting. It decides how ingestion, query, maintenance, and the console itself should behave.
WikiLLM
Recommended for small to medium curated corpora and long-lived knowledge work. The system reads raw sources once, compiles a persistent markdown wiki, maintains AGENTS.md, index.md, and log.md, then answers from that memory before rereading raw evidence.
Retrieval
Recommended for larger corpora, exploratory search, and source-first recall. The system processes documents into evidence layers, then answers through lexical, vector, and graph-oriented retrieval without pretending that a compiled wiki already exists.
If the value comes from cumulative understanding, choose WikiLLM. If the value comes from broad evidence recall across a larger corpus, choose Retrieval.
Product Architecture
The three strategies
EigenVertex should not be described as a single “RAG stack”. It exposes three distinct strategies with different jobs.
Wiki-LLM
Persistent compiled memory. The wiki is the durable artifact. It accumulates useful knowledge over time, receives writeback after good answers, and becomes more valuable as the workspace matures.
Agentic RAG
Evidence-first retrieval and synthesis. The retrieval engine rereads the corpus, finds precise supporting passages, and synthesizes grounded answers when the question depends on raw evidence rather than previously compiled memory.
Graph-LLM
Relational reasoning over the corpus. The graph connects pages, concepts, claims, methods, and tensions so the system can navigate neighborhoods, surface contradictions, and explain why documents belong together.
Wiki-LLM answers “what do we know persistently?”, Agentic RAG answers “which passages support this right now?”, and Graph-LLM answers “how do these ideas connect?”.
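To make "navigating neighborhoods" concrete, the sketch below walks a small concept graph breadth-first and returns every node within a given number of hops. The graph contents and the function name `neighborhood` are illustrative assumptions. They say nothing about how EigenVertex stores or traverses its own graph.

```python
from collections import deque


def neighborhood(graph: dict, start: str, max_hops: int) -> dict:
    """Return every node reachable from `start` within `max_hops`, with its hop distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, set()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


# Hypothetical concept graph (undirected edges stored in both directions).
concepts = {
    "f0 detection": {"YIN", "SWIPE"},
    "YIN": {"f0 detection", "autocorrelation"},
    "SWIPE": {"f0 detection"},
    "autocorrelation": {"YIN"},
}

print(neighborhood(concepts, "f0 detection", max_hops=1))
```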
Ingestion
Supported source types and ingestion flow
EigenVertex is designed for heterogeneous corpora. The same product can ingest written documents, media, expert notes, and generated artifacts while preserving provenance.
WikiLLM ingestion
After processing, the backend reads the source, updates durable wiki pages, refreshes index.md, and appends to log.md. It does not auto-chunk or auto-index into Qdrant.
Retrieval ingestion
After processing, the backend builds evidence layers: chunks, lexical indexes, vector indexes, and graph material. This is the right path when recall and source-first search matter more than durable wiki maintenance.
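As an illustration of the chunking step, the sketch below splits text into fixed-size character windows with overlap. This is one common chunking technique, not a description of the EigenVertex chunker. The function name `chunk_text` and the default sizes are assumptions.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list:
    """Split text into fixed-size character windows that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of indexing some text twice.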
Transcript-first YouTube
For YouTube URLs, EigenVertex now prefers captions and transcripts before falling back to audio transcription, which makes ingestion faster, cheaper, and easier to compile into usable wiki pages.
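The preference order described above can be sketched as a simple first-match lookup. The source labels (`captions`, `transcript`, `audio_transcription`) and the function name are hypothetical. The backend's actual selection logic is not documented here.

```python
# Assumed labels, listed from most preferred to least preferred.
PREFERENCE = ("captions", "transcript", "audio_transcription")


def pick_text_source(available: set) -> str:
    """Return the most preferred text source that is available for a video."""
    for source in PREFERENCE:
        if source in available:
            return source
    raise LookupError("no usable text source for this video")


print(pick_text_source({"audio_transcription", "transcript"}))  # transcript
```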
Large imports
A dry run is still recommended before importing hundreds of documents. URL imports are archived as snapshots when possible, and GitHub gists prefer the raw source view so the resulting document is cleaner and easier to retrieve.
Search & Chat
Chat options only make sense inside the right mode
The console should not present the same controls everywhere. WikiLLM and Retrieval do not have the same knobs.
WikiLLM chat
WikiLLM keeps the path intentionally simple: the system reads the wiki, synthesizes from persistent pages, and can write back durable answers. There is no meaningful choice between vector, hybrid, or graph retrieval in this mode.
Retrieval chat
Retrieval mode is where evidence strategies matter. The current console surface should expose retrieval-oriented choices such as Vector and Graph, not wiki memory shortcuts.
Retrieval strategies
Vector
Default evidence-first retrieval. Use it when you want reliable citations, precise source passages, and predictable latency from indexed chunks.
Graph
Relational navigation and graph-aware synthesis. Use it when you want concept neighborhoods, relationship inspection, or a graph-shaped answer path. Graph-LLM complements, rather than replaces, vector retrieval.
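One way to keep mode-specific knobs apart in client code is to attach retrieval fields only when the workspace is in Retrieval mode. This is a minimal sketch. The field names follow the chat-turn requests in the Native API section, while the `"retrieval"` mode value and the function name are assumptions.

```python
RETRIEVAL_STRATEGIES = ("vector", "graph")


def build_chat_turn_payload(workspace_mode: str, question: str,
                            retrieval_strategy: str = "vector", top_k: int = 8) -> dict:
    """Build a chat-turn body, adding retrieval controls only in Retrieval mode."""
    payload = {
        "question": question,
        "include_sources": True,
        "save_messages": True,
        "query_profile": "balanced",
        "answer_mode": "research",
    }
    if workspace_mode == "retrieval":  # assumed API value for Retrieval mode
        if retrieval_strategy not in RETRIEVAL_STRATEGIES:
            raise ValueError(f"unknown retrieval strategy: {retrieval_strategy}")
        payload["top_k"] = top_k
        payload["retrieval_strategy"] = retrieval_strategy
        payload["retrieval_layers"] = [retrieval_strategy]
    return payload


wiki_turn = build_chat_turn_payload("wiki_llm", "Summarize EigenVertex in five points.")
graph_turn = build_chat_turn_payload("retrieval", "How do YIN and SWIPE relate?", "graph")
print(sorted(wiki_turn))
print(graph_turn["retrieval_layers"])
```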
Speed profile
- Fast: Best for API usage, product flows, and quick checks. Keeps the path short and grounded. In WikiLLM it means concise wiki synthesis. In Retrieval it means a shorter evidence path with lower orchestration cost.
- Balanced: Best for default human chat and most serious corpus questions. The normal “work carefully” profile. It preserves grounding while allowing a fuller synthesis path and better coverage.
- Thorough: Best for ambiguous, comparative, or high-value research questions. Expands coverage and depth. Use it when latency matters less than reading breadth, comparison quality, and reasoning depth.
Answer behavior
- Grounded (`strict` in API): Best for audits, compliance, factual checks, and corpus-only answers. The model should stay inside the retrieved or compiled evidence. If the workspace cannot support an answer, it should say so.
- RAG (`research` in API): Best for serious Q&A, exploration, and synthesis. The answer can derive a fuller synthesis from the evidence, including assumptions, limitations, and follow-up questions, while staying grounded.
- Builder (`build` in API): Best for code, algorithms, protocols, implementation planning, and constructive outputs. The model may propose concrete artifacts guided by the workspace, clearly separating what is supported from what is derived.
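The console labels map onto API values, which a client can translate and validate before sending a request. This sketch uses the `strict`, `research`, and `build` values listed above and the lowercase profile names used in the Native API examples. The function name `to_api_options` is an assumption.

```python
ANSWER_MODES = {"Grounded": "strict", "RAG": "research", "Builder": "build"}
QUERY_PROFILES = ("fast", "balanced", "thorough")


def to_api_options(answer_label: str, speed_label: str) -> dict:
    """Translate console labels into `answer_mode` and `query_profile` API values."""
    profile = speed_label.lower()
    if answer_label not in ANSWER_MODES:
        raise ValueError(f"unknown answer behavior: {answer_label}")
    if profile not in QUERY_PROFILES:
        raise ValueError(f"unknown speed profile: {speed_label}")
    return {"answer_mode": ANSWER_MODES[answer_label], "query_profile": profile}


print(to_api_options("Grounded", "Thorough"))
# {'answer_mode': 'strict', 'query_profile': 'thorough'}
```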
Native API
Use EigenVertex as an application backend
The native API is the richest integration surface. Use it when you need workspaces, documents, connectors, ingestion status, SSE progress, conversations, wiki maintenance, or retrieval controls.
Create a workspace with an explicit mode
export EVTX_BASE_URL="https://api.eigenvertex.com"
export EVTX_API_KEY="evtx_..."
curl -X POST "$EVTX_BASE_URL/v1/workspaces" \
  -H "Authorization: Bearer $EVTX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Piano Research",
    "slug": "piano-research",
    "visibility": "private",
    "workspace_mode": "wiki_llm"
  }'

WikiLLM chat turn
curl -X POST "$EVTX_BASE_URL/v1/conversations/CONVERSATION_ID/chat-turn" \
  -H "Authorization: Bearer $EVTX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Summarize EigenVertex in five points.",
    "include_sources": true,
    "save_messages": true,
    "query_profile": "fast",
    "answer_mode": "research"
  }'

Retrieval chat turn
curl -X POST "$EVTX_BASE_URL/v1/conversations/CONVERSATION_ID/chat-turn" \
  -H "Authorization: Bearer $EVTX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Which algorithms are robust for detecting the f0 of a piano?",
    "top_k": 8,
    "include_sources": true,
    "save_messages": true,
    "query_profile": "balanced",
    "answer_mode": "research",
    "retrieval_strategy": "vector",
    "retrieval_layers": ["vector"]
  }'

Wiki lint and maintain
curl -X POST "$EVTX_BASE_URL/v1/wiki/workspaces/WORKSPACE_ID/lint" \
  -H "Authorization: Bearer $EVTX_API_KEY" \
  -H "Content-Type: application/json"

curl -X POST "$EVTX_BASE_URL/v1/wiki/workspaces/WORKSPACE_ID/maintain" \
  -H "Authorization: Bearer $EVTX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "apply_safe_fixes": true
  }'

Python client example
import os

import requests

base_url = os.environ["EVTX_BASE_URL"]
api_key = os.environ["EVTX_API_KEY"]

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}
payload = {
    "question": "Compare YIN, SWIPE and parametric methods for piano f0 detection.",
    "top_k": 10,
    "include_sources": True,
    "query_profile": "balanced",
    "answer_mode": "research",
    "retrieval_strategy": "vector",
    "retrieval_layers": ["vector"],
}

response = requests.post(
    f"{base_url}/v1/conversations/{os.environ['EVTX_CONVERSATION_ID']}/chat-turn",
    headers=headers,
    json=payload,
    timeout=120,
)
response.raise_for_status()
data = response.json()
print(data["assistant_message"]["content"])
print(data["query_result"]["diagnostics"])

TypeScript client example
type QueryProfile = "fast" | "balanced" | "thorough";
type AnswerMode = "strict" | "research" | "build";
type RetrievalStrategy = "vector" | "graph";

export async function askEigenVertex(params: {
  baseUrl: string;
  apiKey: string;
  conversationId: string;
  question: string;
  profile?: QueryProfile;
  mode?: AnswerMode;
  retrieval?: RetrievalStrategy;
}) {
  const response = await fetch(
    `${params.baseUrl}/v1/conversations/${params.conversationId}/chat-turn`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${params.apiKey}`,
        "Content-Type": "application/json"
      },
      body: JSON.stringify({
        question: params.question,
        top_k: 8,
        include_sources: true,
        save_messages: true,
        query_profile: params.profile ?? "fast",
        answer_mode: params.mode ?? "research",
        retrieval_strategy: params.retrieval ?? "vector",
        retrieval_layers: [params.retrieval ?? "vector"]
      })
    }
  );
  if (!response.ok) {
    throw new Error(await response.text());
  }
  return response.json();
}

OpenAI-compatible API
Consume EigenVertex like a model when that is easier
The OpenAI-compatible facade is useful when an existing application already uses OpenAI SDKs. The request still reaches EigenVertex, but the client can speak the familiar /v1/chat/completions or /v1/responses language.
from openai import OpenAI

client = OpenAI(
    api_key="evtx_...",
    base_url="https://api.eigenvertex.com/v1",
)

response = client.chat.completions.create(
    model="eigenvertex-grounded",
    messages=[
        {"role": "system", "content": "Answer in French with citations when available."},
        {"role": "user", "content": "What does the corpus say about tuning stability?"},
    ],
    extra_body={
        "eigenvertex": {
            "workspace_id": "WORKSPACE_ID",
            "include_sources": True,
            "answer_mode": "research",
            "query_profile": "balanced",
        }
    },
)
print(response.choices[0].message.content)
print(response.model_extra.get("eigenvertex"))

Use the native API for ingestion, workspace mode control, wiki maintenance, and operational workflows. Use the OpenAI-compatible API when your product wants EigenVertex to look like a chat model with extra grounding options.
Operations
What to monitor and maintain
EigenVertex exposes operational surfaces because large or valuable corpora are never “fire and forget”. The exact maintenance shape depends on the workspace mode.
- Ingestion SSE: Watch active documents, current steps, progress percent, and recent workspace updates.
- WikiLLM operations: Run `lint` to diagnose the wiki and `maintain` to apply safe repairs. This is where AGENTS.md, index.md, log.md, provenance, cross-links, and contradiction notes are kept healthy.
- Retrieval evidence quality: Inspect chunking, index readiness, and citation quality. Retrieval workspaces depend on the evidence layer staying readable and trustworthy.
- Deletion and cleanup: Workspace and document deletion must clean database rows, object storage, wiki pages, and retrieval artifacts according to the workspace mode.
- Authors and provenance: When available, authors and provenance should be preserved and displayed. This matters especially for scientific corpora where the value of a claim depends on who made it and where.
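For the ingestion SSE surface listed above, a client needs to turn the event stream into structured updates. The sketch below parses standard server-sent-event framing (`data:` lines terminated by a blank line) from any iterable of text lines. It assumes each event carries a JSON payload. The field names in the sample (`document_id`, `step`, `progress_percent`) are hypothetical, because the actual event schema is not shown in this documentation.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each server-sent event in a stream of text lines."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            # In SSE framing, one optional space after the colon is not part of the value.
            value = line[5:]
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:
            # A blank line ends the event; multi-line data is joined with newlines.
            yield json.loads("\n".join(data_lines))
            data_lines = []
        # Other fields (event:, id:, retry:) and comment lines are ignored in this sketch.


# Hypothetical stream content, for illustration only.
sample = [
    'data: {"document_id": "doc-1", "step": "ocr", "progress_percent": 40}\n',
    "\n",
    ": keep-alive\n",
    'data: {"document_id": "doc-1", "step": "compile", "progress_percent": 90}\n',
    "\n",
]
for event in iter_sse_events(sample):
    print(event["step"], event["progress_percent"])
```

With `requests`, the same function could consume `response.iter_lines(decode_unicode=True)` from a streaming request, since it only needs an iterable of strings.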
Bootstrap
Every WikiLLM workspace starts with AGENTS.md, index.md, and log.md. They define the contract, the catalog, and the chronological journal.
Ingest
Each processed source creates or updates durable pages such as source, topic, entity, and concept pages. The system refreshes index.md and appends to log.md.
Query writeback
Good durable answers can be filed back into the wiki as question or analysis pages instead of disappearing into chat history.
Lint and maintain
The wiki can be health-checked and then repaired through safe maintenance actions such as restoring missing sections, provenance, cross-links, and contradiction notes.
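The check-then-repair sequence can be scripted against the lint and maintain endpoints shown in the Native API section. In this sketch the HTTP call is injected as a `post` callable so the flow can be exercised without a network. The function name, the callable's signature, and the shape of the returned reports are assumptions.

```python
from typing import Callable, Optional


def lint_then_maintain(base_url: str, workspace_id: str,
                       post: Callable[[str, Optional[dict]], dict]) -> dict:
    """Run wiki lint first, then maintenance restricted to safe fixes."""
    root = f"{base_url}/v1/wiki/workspaces/{workspace_id}"
    lint_report = post(f"{root}/lint", None)  # the lint request carries no body
    maintain_report = post(f"{root}/maintain", {"apply_safe_fixes": True})
    return {"lint": lint_report, "maintain": maintain_report}


# Stand-in for a real HTTP client: records each call and returns a placeholder.
calls = []


def fake_post(url: str, body: Optional[dict]) -> dict:
    calls.append((url, body))
    return {"ok": True}


reports = lint_then_maintain("https://api.eigenvertex.com", "WORKSPACE_ID", fake_post)
print([url for url, _ in calls])
```

In a real client, `post` would wrap an authenticated HTTP call such as `requests.post(url, headers=..., json=body).json()`.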
Deployment Models
Cloud EigenVertex now, on-prem path next
The production posture is to keep the backend stateless and keep Postgres, Qdrant, and S3-compatible storage as separate services. This keeps cloud and self-hosted deployment cleaner and makes the split between WikiLLM and Retrieval easier to operate.
Cloud EigenVertex
EigenVertex operates the backend and service dependencies. Teams consume the console and API through API keys, CORS configuration, and provider-level controls.
On-premises / self-host
The client deploys the backend on its own infrastructure and connects it to its own Postgres, Qdrant, and object storage. WikiLLM can stay Postgres-centered, while Retrieval keeps its larger evidence stack.
APP_ENV=production
APP_REQUIRE_API_KEY=true
APP_CORS_ALLOW_ORIGINS=https://labs.eigenvertex.com
DATABASE_URL=postgresql+psycopg://eigenvertex:***@postgres:5432/eigenvertex
QDRANT_URL=http://qdrant:6333
S3_ENDPOINT_URL=http://minio:9000
S3_BUCKET=eigenvertex-documents
OPENAI_API_KEY=...
GEMINI_API_KEY=...
MISTRAL_API_KEY=...
LLAMA_API_KEY=...

Do not bake Postgres, Qdrant, or object storage into the backend image. They are persistent data services with their own backup, upgrade, and security lifecycle.
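Because the data services live outside the backend image, a startup check that fails fast on missing connection settings can be useful. This is a minimal sketch using the variable names from the environment listing above. Treating all four as required is an assumption, and a WikiLLM-only deployment may need fewer of them.

```python
from typing import Mapping

# Assumed to be required; adjust to the workspace modes you actually run.
REQUIRED_SERVICE_SETTINGS = ("DATABASE_URL", "QDRANT_URL", "S3_ENDPOINT_URL", "S3_BUCKET")


def check_external_services(env: Mapping[str, str]) -> dict:
    """Return the external service settings, or raise if any is missing or empty."""
    missing = [key for key in REQUIRED_SERVICE_SETTINGS if not env.get(key)]
    if missing:
        raise RuntimeError("missing service settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_SERVICE_SETTINGS}


example_env = {
    "DATABASE_URL": "postgresql+psycopg://eigenvertex:***@postgres:5432/eigenvertex",
    "QDRANT_URL": "http://qdrant:6333",
    "S3_ENDPOINT_URL": "http://minio:9000",
    "S3_BUCKET": "eigenvertex-documents",
}
settings = check_external_services(example_env)
print(sorted(settings))
```

In a deployment, `os.environ` would be passed in place of `example_env`.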