<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>Chaos Cypher Blog</title>
        <link>https://chaoscypher.com/blog</link>
        <description>Chaos Cypher Blog</description>
        <lastBuildDate>Thu, 12 Mar 2026 00:00:00 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>en</language>
        <copyright>Copyright 2026 Chaos Cypher, Inc.</copyright>
        <item>
            <title><![CDATA[Extract Smarter: How Domain-Aware AI Builds Better Knowledge Graphs]]></title>
            <link>https://chaoscypher.com/blog/domain-extraction-guide</link>
            <guid>https://chaoscypher.com/blog/domain-extraction-guide</guid>
            <pubDate>Thu, 12 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[How domain-specific extraction in Chaos Cypher produces typed entities and meaningful relationships instead of generic knowledge graphs you cannot query.]]></description>
            <content:encoded><![CDATA[<p>Most AI extraction tools treat every document the same way. Upload a medical paper or a legal contract and you get the same generic entity types, the same vague relationships, the same disappointing graph. Chaos Cypher takes a different approach: it detects what kind of document you uploaded and adapts its entire extraction pipeline to match.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-problem-with-generic-extraction">The Problem with Generic Extraction<a href="https://chaoscypher.com/blog/domain-extraction-guide#the-problem-with-generic-extraction" class="hash-link" aria-label="Direct link to The Problem with Generic Extraction" title="Direct link to The Problem with Generic Extraction" translate="no">​</a></h2>
<p>Here's a sentence you might find in a clinical document:</p>
<blockquote>
<p>Patient with hypertension started on lisinopril 10mg daily. The ACE inhibitor is contraindicated with potassium supplements. Side effects include dry cough and dizziness.</p>
</blockquote>
<p>A generic extraction pipeline -- the kind most tools use -- will pull out a handful of entities and connect them with whatever relationship labels the LLM feels like inventing. You might get "Lisinopril" typed as an <strong>Item</strong>, "Hypertension" as a <strong>Concept</strong>, and "Dry Cough" as another <strong>Concept</strong>. The relationships between them? Probably <code>related_to</code> and <code>influences</code>. Maybe <code>associated_with</code> if you are lucky.</p>
<p>This is the "garbage in, garbage out" of knowledge graphs. It's not that the AI failed to read the text. It read it fine. The problem is that nobody told it what to look for, what types are valid, or what the relationships between those types should mean.</p>
<p>The graph you get is technically correct and practically useless. You cannot query "which drugs treat hypertension" because the system does not know what a Drug is. You cannot find contraindications because <code>related_to</code> could mean anything. Every edge in the graph carries the same semantic weight as a shrug.</p>
<p>Now run the same sentence through Chaos Cypher with the <strong>medical</strong> domain active:</p>
<ul>
<li class=""><strong>Lisinopril</strong> becomes a <strong>Drug</strong> with dosage form and mechanism of action as properties</li>
<li class=""><strong>Hypertension</strong> becomes a <strong>Condition</strong></li>
<li class=""><strong>Dry Cough</strong> and <strong>Dizziness</strong> become <strong>Side Effects</strong></li>
<li class=""><strong>Potassium Supplements</strong> gets recognized as a <strong>Drug</strong> (because supplements have drug interactions too)</li>
</ul>
<p>The relationships are just as precise: <code>treats</code>, <code>contraindicated_with</code>, <code>produces_side_effect</code>. Each one is typed, directional, and constrained. Only a Drug, Treatment, or Procedure can be the source of <code>treats</code>, and its target must be a Condition or Symptom. Only a Drug can be the source of <code>produces_side_effect</code>, and its target must be a Side Effect. The LLM isn't guessing -- it's following a schema.</p>
<p>That's what domain-aware extraction does. It turns a language model from a general-purpose pattern matcher into a domain specialist.</p>
<p><img decoding="async" loading="lazy" alt="Source detail showing entity and relationship distribution charts" src="https://chaoscypher.com/assets/images/source-detail-overview-7f510a7f654298be51ff1ddcaa646260.png" width="1280" height="800" class="img_ev3q"></p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="how-it-works-upload-to-knowledge-graph">How It Works: Upload to Knowledge Graph<a href="https://chaoscypher.com/blog/domain-extraction-guide#how-it-works-upload-to-knowledge-graph" class="hash-link" aria-label="Direct link to How It Works: Upload to Knowledge Graph" title="Direct link to How It Works: Upload to Knowledge Graph" translate="no">​</a></h2>
<p>The workflow is straightforward. You upload a document. Chaos Cypher figures out what domain it belongs to, loads the right extraction rules, and runs the pipeline. You don't need to configure anything upfront -- though you can override the detected domain if you want.</p>
<p>Here's what happens behind the scenes:</p>
<ol>
<li class="">
<p><strong>Detection</strong> -- Chaos Cypher samples the first few thousand characters of your document and scores it against all registered domains simultaneously. Each domain has weighted keyword groups, regex patterns, and file type signals. The highest-scoring domain wins.</p>
</li>
<li class="">
<p><strong>Guidance injection</strong> -- The winning domain's extraction rules get injected into the LLM prompt. This includes entity type definitions, relationship constraints, exclusion rules (what <em>not</em> to extract), and worked examples of correct extractions.</p>
</li>
<li class="">
<p><strong>Strict type enforcement</strong> -- The LLM is instructed to only use entity types from the domain's template list. After extraction, a code-level filter drops any entity whose type does not match a known template. No hallucinated types survive.</p>
</li>
<li class="">
<p><strong>Relationship validation</strong> -- Each relationship is checked against source/target type constraints. A <code>treats</code> relationship must flow from a Drug, Treatment, or Procedure to a Condition or Symptom. Anything else gets rejected.</p>
</li>
<li class="">
<p><strong>Quality scoring</strong> -- Extracted entities and relationships are scored by domain relevance. Domain-specific types like Drug and Condition score higher than generic fallbacks. This surfaces the most valuable parts of your graph.</p>
</li>
</ol>
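<p>To make step 5 concrete, here is a minimal Python sketch of ranking entities by domain relevance. The post only says that domain-specific types score higher than generic fallbacks, so the two score constants and the <code>rank_entities</code> helper are hypothetical and are not Chaos Cypher's actual scoring code.</p>

```python
# Hypothetical scores: the only rule taken from the post is that
# domain-specific types outrank generic fallback types.
DOMAIN_TYPE_SCORE = 1.0
GENERIC_TYPE_SCORE = 0.5


def rank_entities(entities, domain_types):
    """Return entities sorted so domain-specific types come first."""
    def score(entity):
        if entity["type"] in domain_types:
            return DOMAIN_TYPE_SCORE
        return GENERIC_TYPE_SCORE

    # sorted() is stable, so entities with equal scores keep their input order.
    return sorted(entities, key=score, reverse=True)


entities = [
    {"name": "Dry Cough", "type": "Concept"},
    {"name": "Lisinopril", "type": "Drug"},
    {"name": "Hypertension", "type": "Condition"},
]
ranked = rank_entities(entities, {"Drug", "Condition", "Side Effect"})
print([entity["name"] for entity in ranked])
```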
<p><img decoding="async" loading="lazy" alt="Add Source dialog with URL input and file drag-and-drop" src="https://chaoscypher.com/assets/images/sources-upload-dialog-9bab33b9d1820d6fa689f2f595f8562d.png" width="1280" height="800" class="img_ev3q"></p>
<p>Chaos Cypher ships with <strong>16 built-in domains</strong>, each tuned for a different category of document:</p>
<table><thead><tr><th>Domain</th><th>Typical Entity Types</th><th>Best For</th></tr></thead><tbody><tr><td><strong>Biographical</strong></td><td>Person, Life Event, Achievement, Relationship</td><td>Biographies, memoirs, personal histories</td></tr><tr><td><strong>Cybersecurity</strong></td><td>Threat Actor, Vulnerability, Malware, Attack Technique</td><td>Threat intel, incident reports, CVE research</td></tr><tr><td><strong>Educational</strong></td><td>Course, Learning Objective, Concept, Assessment</td><td>Textbooks, curricula, instructional materials</td></tr><tr><td><strong>Financial</strong></td><td>Company, Financial Instrument, Market Event, Regulation</td><td>Earnings reports, market analysis, SEC filings</td></tr><tr><td><strong>Generic</strong></td><td>Person, Organization, Event, Concept, Location</td><td>General-purpose fallback for any content</td></tr><tr><td><strong>Historical</strong></td><td>Historical Figure, Event, Treaty, Dynasty, Territory</td><td>Primary sources, historiography, timelines</td></tr><tr><td><strong>Investigation</strong></td><td>Suspect, Evidence, Witness, Case, Incident</td><td>Criminal/civil investigations, case files, forensics</td></tr><tr><td><strong>Legal</strong></td><td>Statute, Case, Party, Obligation, Legal Principle</td><td>Contracts, court opinions, regulatory filings</td></tr><tr><td><strong>Literary</strong></td><td>Character, Setting, Theme, Plot Element</td><td>Novels, poetry, drama, literary criticism</td></tr><tr><td><strong>Medical</strong></td><td>Drug, Condition, Symptom, Procedure, Side Effect</td><td>Clinical documents, pharmaceutical literature</td></tr><tr><td><strong>News</strong></td><td>Person, Organization, Event, Statement, Policy</td><td>News articles, press releases, journalism</td></tr><tr><td><strong>Philosophical</strong></td><td>Philosopher, Argument, Concept, School of Thought</td><td>Philosophy texts across global traditions</td></tr><tr><td><strong>Political</strong></td><td>Political 
Entity, Policy, Election, Legislation</td><td>Governance docs, political theory, policy analysis</td></tr><tr><td><strong>Scientific</strong></td><td>Hypothesis, Method, Finding, Dataset, Organism</td><td>Research papers, experiments, academic publications</td></tr><tr><td><strong>Technical</strong></td><td>Module, Class, Function, Endpoint, Design Pattern</td><td>API docs, codebases, technical specifications</td></tr><tr><td><strong>Theological</strong></td><td>Deity, Scripture, Doctrine, Ritual, Religious Figure</td><td>Sacred texts, theology, comparative religion</td></tr></tbody></table>
<p>Every domain uses strict entity type enforcement by default. The medical domain defines 17 entity types. The technical domain has 14. These aren't suggestions -- they're the only types the LLM is allowed to produce. That constraint is what separates a clean, queryable graph from a noisy soup of ad-hoc labels.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="under-the-hood-domain-detection-and-extraction-quality">Under the Hood: Domain Detection and Extraction Quality<a href="https://chaoscypher.com/blog/domain-extraction-guide#under-the-hood-domain-detection-and-extraction-quality" class="hash-link" aria-label="Direct link to Under the Hood: Domain Detection and Extraction Quality" title="Direct link to Under the Hood: Domain Detection and Extraction Quality" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="how-detection-works">How Detection Works<a href="https://chaoscypher.com/blog/domain-extraction-guide#how-detection-works" class="hash-link" aria-label="Direct link to How Detection Works" title="Direct link to How Detection Works" translate="no">​</a></h3>
<p>Domain detection runs a scoring algorithm across all registered domains simultaneously. Each domain defines its detection rules in a JSON-LD config file with three signal types:</p>
<p><strong>Weighted keyword groups.</strong> The medical domain has six keyword groups: <code>clinical_core</code> (weight 1.2), <code>pharmaceutical</code> (weight 1.0), <code>diagnostic</code> (weight 0.9), <code>anatomy</code> (weight 0.8), <code>procedures</code> (weight 0.9), and <code>clinical_terms</code> (weight 0.8). Each keyword match boosts the confidence score by <code>per_keyword_boost * weight</code>. A document full of "diagnosis", "treatment", and "symptoms" racks up points fast in the clinical_core group, while scattered mentions of "cardiac" and "pulmonary" add smaller anatomy-weighted boosts.</p>
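<p>The keyword arithmetic can be sketched in a few lines of Python. The group weights below come from the description above, but the term lists, the base score, the boost value, and the choice to count each distinct term once are assumptions made for illustration.</p>

```python
# Group weights are from the post. Term lists, BASE_SCORE, and
# PER_KEYWORD_BOOST are illustrative assumptions.
KEYWORD_GROUPS = {
    "clinical_core": {"terms": ["diagnosis", "treatment", "symptoms"], "weight": 1.2},
    "anatomy": {"terms": ["cardiac", "pulmonary"], "weight": 0.8},
}
BASE_SCORE = 0.2
PER_KEYWORD_BOOST = 0.05


def keyword_score(text, groups=KEYWORD_GROUPS):
    """Add per_keyword_boost * weight for every distinct term found in text."""
    lowered = text.lower()
    score = BASE_SCORE
    for group in groups.values():
        hits = sum(1 for term in group["terms"] if term in lowered)
        score += hits * PER_KEYWORD_BOOST * group["weight"]
    return score


sample = "Diagnosis and treatment of cardiac symptoms"
score = keyword_score(sample)
print(round(score, 2))
```

<p>In this sketch, three clinical_core hits add 3 x 0.05 x 1.2 = 0.18 and one anatomy hit adds 0.05 x 0.8 = 0.04 on top of the base score.</p>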
<p><strong>Regex patterns.</strong> Keywords catch common terms, but patterns catch domain-specific notation. The medical domain matches dosage expressions like <code>\d+\s*(mg|mcg|ml)</code>, ICD codes like <code>ICD-10:J45</code>, and prescription abbreviations like <code>b.i.d.</code> and <code>p.r.n.</code>. Each pattern match carries its own weight -- dosage notation at 1.4x, ICD codes at 1.5x. A single ICD code in a document is a strong medical signal.</p>
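<p>Pattern signals can be pictured with Python's standard <code>re</code> module. The dosage regex and the two weights are taken from the text above, while the ICD regex, the <code>PATTERN_BOOST</code> value, and the rule that each pattern counts once per document are assumptions.</p>

```python
import re

# Weights 1.4 and 1.5 are from the post. The ICD regex and PATTERN_BOOST
# are assumptions made for this sketch.
PATTERNS = [
    (re.compile(r"\d+\s*(mg|mcg|ml)"), 1.4),   # dosage notation
    (re.compile(r"ICD-10:[A-Z]\d{2}"), 1.5),   # ICD code, e.g. ICD-10:J45
]
PATTERN_BOOST = 0.15


def pattern_score(text):
    """Add PATTERN_BOOST * weight once for each pattern found in text."""
    return sum(PATTERN_BOOST * weight for regex, weight in PATTERNS if regex.search(text))


text = "Asthma (ICD-10:J45): salbutamol 100 mcg as needed"
boost = pattern_score(text)
print(round(boost, 3))
```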
<p><strong>File and document type signals.</strong> File extensions (<code>.py</code> for technical) and document type metadata (<code>medical_document</code>, <code>openapi</code>) provide additional boosts.</p>
<p>The final confidence score is compared against a per-domain minimum threshold. Medical requires 0.4 minimum confidence. The generic domain has a threshold of 0.0 -- it always matches as a fallback, but with the lowest possible score (0.1), so any specialized domain that passes its threshold will win.</p>
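<p>The threshold-and-fallback rule can be sketched as follows. Medical's 0.4 threshold and the generic domain's 0.0 threshold and 0.1 score are from the paragraph above. The technical threshold and the <code>pick_domain</code> helper are hypothetical.</p>

```python
# Medical (0.4) and generic (0.0) thresholds are from the post.
# The technical threshold is an assumption.
MIN_THRESHOLDS = {"medical": 0.4, "technical": 0.4, "generic": 0.0}
GENERIC_SCORE = 0.1  # generic always matches, with the lowest score


def pick_domain(scores):
    """Return the highest-scoring domain that clears its own threshold."""
    candidates = {"generic": GENERIC_SCORE}
    for domain, score in scores.items():
        if domain != "generic" and score >= MIN_THRESHOLDS[domain]:
            candidates[domain] = score
    return max(candidates, key=candidates.get)


print(pick_domain({"medical": 0.62, "technical": 0.25}))
print(pick_domain({"medical": 0.30, "technical": 0.25}))
```

<p>The first call selects the medical domain. In the second, nothing clears its threshold, so the generic fallback is used.</p>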
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="how-domains-shape-extraction-quality">How Domains Shape Extraction Quality<a href="https://chaoscypher.com/blog/domain-extraction-guide#how-domains-shape-extraction-quality" class="hash-link" aria-label="Direct link to How Domains Shape Extraction Quality" title="Direct link to How Domains Shape Extraction Quality" translate="no">​</a></h3>
<p>Detection picks the right domain. But the real value is in what happens next -- how the selected domain controls the extraction pipeline.</p>
<p><strong>Entity guidance tells the LLM what to extract and what to skip.</strong> The medical domain instructs: "Extract conditions, symptoms, treatments, drugs, procedures, and anatomical locations. Include dosage information as properties on drug entities." It also lists explicit exclusion rules: don't extract dosage numbers alone ("500mg" is a property of a Drug, not a standalone entity), don't extract study references ("Figure 1", "Table 2"), don't extract administrative codes as entities.</p>
<p><strong>Strict type enforcement prevents hallucinated types.</strong> When strict mode is on -- and it is on by default in every built-in domain -- the LLM receives a closed list of valid entity types. The medical domain allows exactly 17 types: Condition, Symptom, Treatment, Drug, Procedure, Diagnostic Test, Anatomy, Pathogen, Clinical Trial, Dosage, Side Effect, Risk Factor, Gene/Biomarker, Patient Population, Guideline/Protocol, Outcome/Endpoint, and Mechanism of Action. Anything the LLM produces outside that list gets dropped in post-processing. No more "Medical Concept" or "Health Thing" cluttering your graph.</p>
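<p>The post-processing filter amounts to a set-membership check. This sketch uses the 17 medical types listed above. The entity dictionaries and the <code>drop_unknown_types</code> helper are assumptions about shape, not the real implementation.</p>

```python
# The 17 medical entity types listed in the post.
MEDICAL_TYPES = {
    "Condition", "Symptom", "Treatment", "Drug", "Procedure",
    "Diagnostic Test", "Anatomy", "Pathogen", "Clinical Trial", "Dosage",
    "Side Effect", "Risk Factor", "Gene/Biomarker", "Patient Population",
    "Guideline/Protocol", "Outcome/Endpoint", "Mechanism of Action",
}


def drop_unknown_types(entities, allowed=MEDICAL_TYPES):
    """Keep only entities whose type is in the domain's closed list."""
    return [entity for entity in entities if entity["type"] in allowed]


extracted = [
    {"name": "Lisinopril", "type": "Drug"},
    {"name": "Blood Pressure", "type": "Medical Concept"},  # not a template type
    {"name": "Hypertension", "type": "Condition"},
]
kept = drop_unknown_types(extracted)
print([entity["name"] for entity in kept])
```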
<p><strong>Relationship constraints validate source and target combinations.</strong> The medical domain's <code>treats</code> relationship is constrained: source must be Drug, Treatment, or Procedure; target must be Condition or Symptom. If the LLM tries to say a Symptom <code>treats</code> a Drug, the relationship fails validation. This catches the most common extraction error -- reversed or semantically nonsensical edges.</p>
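<p>A source/target check like this one fits in a small lookup table. The <code>treats</code> rule matches the constraint described above. The data shapes, the <code>is_valid</code> helper, and the choice to reject relationship types with no rule are assumptions.</p>

```python
# The treats rule is from the post. The dictionary layout is an assumption.
CONSTRAINTS = {
    "treats": {
        "source": {"Drug", "Treatment", "Procedure"},
        "target": {"Condition", "Symptom"},
    },
}


def is_valid(relationship, entity_types):
    """Check a relationship's source and target types against its rule."""
    rule = CONSTRAINTS.get(relationship["type"])
    if rule is None:
        return False  # assumption: relationship types with no rule are rejected
    source_type = entity_types.get(relationship["source"])
    target_type = entity_types.get(relationship["target"])
    return source_type in rule["source"] and target_type in rule["target"]


entity_types = {"Lisinopril": "Drug", "Hypertension": "Condition", "Dry Cough": "Symptom"}
forward = {"type": "treats", "source": "Lisinopril", "target": "Hypertension"}
backward = {"type": "treats", "source": "Dry Cough", "target": "Lisinopril"}
print(is_valid(forward, entity_types), is_valid(backward, entity_types))
```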
<p><strong>Compatibility groups enable smart deduplication.</strong> When the same entity appears in different chunks with slightly different types -- "Hypertension" as a Condition in one chunk and as a Symptom in another -- the compatibility groups determine whether they can be merged. In the medical domain, Condition and Symptom share the <code>clinical</code> group, so they are merge-eligible. Drug, Treatment, and Procedure share the <code>treatment</code> group. This prevents duplicate entities without losing type precision.</p>
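<p>Merge eligibility can be sketched as a group lookup. The two groups are the ones named above. Matching names case-insensitively and the <code>can_merge</code> helper are assumptions, since the post does not say how names are compared.</p>

```python
# Group membership is from the post. Name matching is an assumption.
COMPATIBILITY_GROUPS = {
    "clinical": {"Condition", "Symptom"},
    "treatment": {"Drug", "Treatment", "Procedure"},
}


def can_merge(first, second):
    """Two entities merge if names match and their types share a group."""
    if first["name"].lower() != second["name"].lower():
        return False
    if first["type"] == second["type"]:
        return True
    return any(
        first["type"] in group and second["type"] in group
        for group in COMPATIBILITY_GROUPS.values()
    )


chunk_one = {"name": "Hypertension", "type": "Condition"}
chunk_two = {"name": "hypertension", "type": "Symptom"}
chunk_three = {"name": "Hypertension", "type": "Drug"}
print(can_merge(chunk_one, chunk_two), can_merge(chunk_one, chunk_three))
```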
<p><strong>Property type mapping rescues mistyped entities.</strong> Sometimes the LLM extracts "Severity" as a standalone entity when it should be a property on a Condition. The medical domain's property mapping knows that "Severity" should be absorbed into Condition as a <code>severity</code> property, and "Mechanism" into Drug as a <code>mechanism</code> property. Instead of cluttering the graph with orphaned attribute nodes, they get folded into the right place.</p>
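<p>Here is one way the folding step could look. The two mappings are from the paragraph above. How an attribute gets paired with its owning entity is not described in the post, so the <code>links</code> argument (pairs of attribute name and owner name) and the <code>fold_attributes</code> helper are hypothetical.</p>

```python
# Mappings are from the post: mistyped entity type -> (owner type, property key).
PROPERTY_MAP = {
    "Severity": ("Condition", "severity"),
    "Mechanism": ("Drug", "mechanism"),
}


def fold_attributes(entities, links):
    """Fold attribute-like entities into their owners as properties.

    links is a list of (attribute_name, owner_name) pairs; this pairing is an
    assumption. Owner dictionaries are updated in place.
    """
    by_name = {entity["name"]: entity for entity in entities}
    absorbed = set()
    for attribute_name, owner_name in links:
        attribute = by_name.get(attribute_name)
        owner = by_name.get(owner_name)
        if attribute is None or owner is None:
            continue
        mapping = PROPERTY_MAP.get(attribute["type"])
        if mapping and mapping[0] == owner["type"]:
            owner.setdefault("properties", {})[mapping[1]] = attribute["name"]
            absorbed.add(attribute_name)
    return [entity for entity in entities if entity["name"] not in absorbed]


entities = [
    {"name": "Hypertension", "type": "Condition"},
    {"name": "moderate", "type": "Severity"},
]
cleaned = fold_attributes(entities, [("moderate", "Hypertension")])
print(cleaned)
```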
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="try-it-yourself">Try It Yourself<a href="https://chaoscypher.com/blog/domain-extraction-guide#try-it-yourself" class="hash-link" aria-label="Direct link to Try It Yourself" title="Direct link to Try It Yourself" translate="no">​</a></h2>
<p>Every built-in domain is just a JSON-LD file. No Python, no compilation, no framework code. If you need a domain for your field that doesn't exist yet, you can create one in about 20 minutes.</p>
<p>Let's build a <strong>startup</strong> domain for analyzing pitch decks, funding announcements, and tech industry news.</p>
<p>Create a file called <code>startup.jsonld</code> in your <code>data/plugins/domains/</code> directory:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#c8c8e0"><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"@context"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token property" style="color:#c8c8e0">"@vocab"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"https://chaoscypher.io/schema/domain#"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token property" style="color:#c8c8e0">"schema"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"https://schema.org/"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" 
style="color:#39ff14">"schema:name"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"schema:description"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"@type"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"ExtractionDomain"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"@id"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"domain:startup"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" 
style="color:#39ff14">"startup"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"version"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"1.0.0"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Startup ecosystem: funding, founders, products, and acquisitions"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"strict_entity_types"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token boolean" style="color:#ff6d00">true</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"detection"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span 
class="token property" style="color:#c8c8e0">"keywords"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"funding"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"terms"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"series A"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"series B"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"seed round"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"venture capital"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">                  </span><span class="token string" style="color:#39ff14">"valuation"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"fundraise"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> 
</span><span class="token string" style="color:#39ff14">"runway"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"cap table"</span><span class="token punctuation" style="color:#808098">]</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"weight"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">1.3</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"ecosystem"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"terms"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"startup"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"founder"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span 
class="token string" style="color:#39ff14">"co-founder"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"incubator"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">                  </span><span class="token string" style="color:#39ff14">"accelerator"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"pivot"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"product-market fit"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"MVP"</span><span class="token punctuation" style="color:#808098">]</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"weight"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">1.1</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span 
class="token plain">    </span><span class="token property" style="color:#c8c8e0">"patterns"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">{</span><span class="token property" style="color:#c8c8e0">"regex"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"\\$\\d+[MBK]\\s+(seed|series|round|valuation)"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"weight"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">1.5</span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">{</span><span class="token property" style="color:#c8c8e0">"regex"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"(?i)Y Combinator|Techstars|500 Startups"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"weight"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">1.3</span><span class="token punctuation" style="color:#808098">}</span><span class="token 
plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token punctuation" style="color:#808098">]</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token property" style="color:#c8c8e0">"confidence"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"base_score"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">0.2</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"per_keyword_boost"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">0.05</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"pattern_boost"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">0.15</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" 
style="color:#c8c8e0">"min_threshold"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">0.4</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"entity_guidance"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Extract companies, founders, investors, funding rounds, and products. 
Attach dollar amounts and dates as properties on Funding Round entities, not as standalone entities."</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"templates"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token property" style="color:#c8c8e0">"node_templates"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"id"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"startup_company"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Company"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" 
style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"A startup, corporation, or business entity"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"requires_named_referent"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token boolean" style="color:#ff6d00">true</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"quality_score"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">25</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"properties"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">          </span><span class="token punctuation" style="color:#808098">{</span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" 
style="color:#39ff14">"stage"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"display_name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Stage"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"property_type"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"text"</span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">          </span><span class="token punctuation" style="color:#808098">{</span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"industry"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"display_name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Industry"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"property_type"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"text"</span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div 
class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token punctuation" style="color:#808098">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"id"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"startup_person"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Founder"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"A founder, co-founder, or key executive"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"requires_named_referent"</span><span 
class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token boolean" style="color:#ff6d00">true</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"quality_score"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">25</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"id"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"startup_investor"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Investor"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" 
style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"A VC firm, angel investor, or investment entity"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"requires_named_referent"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token boolean" style="color:#ff6d00">true</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"quality_score"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">25</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"id"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"startup_round"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token 
operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Funding Round"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"A specific funding event (seed, Series A, etc.)"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"requires_named_referent"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token boolean" style="color:#ff6d00">false</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"quality_score"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">25</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"properties"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">          
</span><span class="token punctuation" style="color:#808098">{</span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"amount"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"display_name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Amount"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"property_type"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"text"</span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">          </span><span class="token punctuation" style="color:#808098">{</span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"date"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"display_name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Date"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"property_type"</span><span class="token operator" 
style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"date"</span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token punctuation" style="color:#808098">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"id"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"startup_product"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Product"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"A software product, platform, or service"</span><span class="token punctuation" 
style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"requires_named_referent"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token boolean" style="color:#ff6d00">true</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"quality_score"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">18</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token punctuation" style="color:#808098">]</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token property" style="color:#c8c8e0">"edge_templates"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"id"</span><span 
class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"startup_founded_by"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"founded_by"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Company was founded by a person"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"inverse"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"founded"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"source_types"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"Company"</span><span class="token punctuation" style="color:#808098">]</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span 
class="token property" style="color:#c8c8e0">"target_types"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"Founder"</span><span class="token punctuation" style="color:#808098">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"id"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"startup_invested_in"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"invested_in"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Investor participated in a funding round"</span><span class="token punctuation" style="color:#808098">,</span><span 
class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"inverse"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"funded_by"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"source_types"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"Investor"</span><span class="token punctuation" style="color:#808098">]</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"target_types"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"Funding Round"</span><span class="token punctuation" style="color:#808098">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"id"</span><span 
class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"startup_raised"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"raised"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Company raised a funding round"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"inverse"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"round_for"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"source_types"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"Company"</span><span class="token punctuation" style="color:#808098">]</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token 
property" style="color:#c8c8e0">"target_types"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"Funding Round"</span><span class="token punctuation" style="color:#808098">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"id"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"startup_acquired_by"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"acquired_by"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Company was acquired by another company"</span><span class="token punctuation" style="color:#808098">,</span><span 
class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"inverse"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"acquired"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"source_types"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"Company"</span><span class="token punctuation" style="color:#808098">]</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"target_types"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"Company"</span><span class="token punctuation" style="color:#808098">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"id"</span><span class="token 
operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"startup_builds"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"builds"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Company builds or maintains a product"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"inverse"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"built_by"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"source_types"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"Company"</span><span class="token punctuation" style="color:#808098">]</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"> </span><span class="token 
property" style="color:#c8c8e0">"target_types"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"Product"</span><span class="token punctuation" style="color:#808098">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token punctuation" style="color:#808098">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain"></span><span class="token punctuation" style="color:#808098">}</span><br></div></code></pre></div></div>
<p>A few things to notice about this file:</p>
<p><strong>The <code>detection</code> section</strong> defines how Chaos Cypher recognizes startup content. The keyword groups are weighted -- "series A" and "venture capital" in the <code>funding</code> group carry more weight (1.3x) than general ecosystem terms (1.1x). The regex patterns catch dollar-amount-plus-round expressions like "$50M Series B" at 1.5x weight. These signals stack: a pitch deck mentioning several funding terms and a dollar figure will score well above the 0.4 threshold.</p>
<p><strong>The <code>templates</code> section</strong> defines the vocabulary. Five entity types, five relationship types. Each entity template has an <code>id</code>, a <code>name</code> (the type label that appears in the graph), and a <code>description</code> that helps the LLM understand what qualifies. The <code>requires_named_referent</code> flag tells the system whether an entity needs a proper name -- a Company does, but a Funding Round does not (it can be "Series A round" or just "the seed round"). Properties like <code>amount</code> and <code>stage</code> get attached to entities rather than floating as separate nodes.</p>
<p><strong>The <code>edge_templates</code></strong> constrain which entity types can appear on each side of a relationship. <code>founded_by</code> only flows from Company to Founder. <code>invested_in</code> only flows from Investor to Funding Round. The <code>inverse</code> field defines the reverse label for bidirectional traversal.</p>
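<p>To make the constraint concrete, here is a minimal Python sketch of how a candidate relationship could be checked against edge templates. It is an illustration, not Chaos Cypher's actual validator: the <code>EDGE_TEMPLATES</code> dictionary and the helper names <code>is_valid_edge</code> and <code>inverse_edge</code> are assumptions made for this example. Only the <code>builds</code> template is copied in full from the file above; the <code>founded_by</code> entry carries just the type constraint described in the text.</p>

```python
# Hypothetical in-memory view of two edge templates (names of the helpers
# and of this dictionary are illustrative, not part of Chaos Cypher).
EDGE_TEMPLATES = {
    "builds": {
        "inverse": "built_by",
        "source_types": ["Company"],
        "target_types": ["Product"],
    },
    # Inverse label left out here; only the type constraint is modeled.
    "founded_by": {
        "source_types": ["Company"],
        "target_types": ["Founder"],
    },
}


def is_valid_edge(name, source_type, target_type, templates=EDGE_TEMPLATES):
    """Return True when the edge name exists and both endpoint types are allowed."""
    template = templates.get(name)
    if template is None:
        return False
    return (
        source_type in template["source_types"]
        and target_type in template["target_types"]
    )


def inverse_edge(name, source, target, templates=EDGE_TEMPLATES):
    """Return the reversed triple using the template's inverse label, if any."""
    inverse = templates[name].get("inverse")
    if inverse is None:
        return None
    return (target, inverse, source)


print(is_valid_edge("builds", "Company", "Product"))
print(is_valid_edge("builds", "Product", "Company"))
print(inverse_edge("builds", "Acme", "Widget"))
```

<p>A Company-to-Product <code>builds</code> edge passes the check, the reversed direction is rejected, and the inverse lookup yields the <code>built_by</code> triple used for traversal in the other direction.</p>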
<p>Restart Chaos Cypher and your domain is live. Upload a TechCrunch article or a pitch deck and watch the detection engine pick it up. Your custom entity types appear in the graph, constrained by the relationships you defined.</p>
<p><img decoding="async" loading="lazy" alt="Source extraction view showing domain-specific entity types" src="https://chaoscypher.com/assets/images/source-extraction-entities-0087201044ca156468721e42753c285c.png" width="1280" height="800" class="img_ev3q"></p>
<p>If you need to go deeper, the built-in domains show what else is possible: normalization keywords that fix LLM type inconsistencies, compatibility groups for smart deduplication, property type mappings that absorb mistyped entities, alias examples that teach the LLM about synonym handling, and extraction limits that tune relationship density. The medical domain is the most comprehensive example -- it defines 17 entity types, 20 relationship types, dosage regex patterns, ICD code detection, and evidence validation in strict mode. Study it when you want the full picture.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="whats-next">What's Next<a href="https://chaoscypher.com/blog/domain-extraction-guide#whats-next" class="hash-link" aria-label="Direct link to What's Next" title="Direct link to What's Next" translate="no">​</a></h2>
<p>We're planning more specialized domains -- supply chain, environmental science, and music theory are on the shortlist. But the real potential is in what users build. Every field has its own vocabulary, its own entity types, its own relationship patterns. A materials scientist cares about Crystal Structure, Synthesis Method, and Property. A genealogist needs Person, Family, Vital Record, and Census Entry. A cybersecurity analyst -- who already has a built-in domain -- might want to fork it and add types specific to their organization's threat model.</p>
<p>If you build a domain for your field, share it. A JSON-LD file is small, portable, and easy to review. Drop it in <code>data/plugins/domains/</code> and it works. No pull request required to use it, but we would love to include community domains in the built-in set for others to benefit from.</p>
<p>Domains work identically whether you're running <a class="" href="https://chaoscypher.com/blog/local-ai-knowledge-graph">locally with Ollama</a> or with a cloud provider. The domain system documentation covers the full JSON-LD schema, all available configuration options, and advanced features like extraction density tuning and evidence validation modes. Start with the five-entity example above, test it on your documents, and iterate from there.</p>]]></content:encoded>
            <category>tutorials</category>
            <category>ai</category>
        </item>
        <item>
            <title><![CDATA[Why Your RAG Chat is Missing Half the Answers (And How GraphRAG Fixes It)]]></title>
            <link>https://chaoscypher.com/blog/graphrag-enhanced-search</link>
            <guid>https://chaoscypher.com/blog/graphrag-enhanced-search</guid>
            <pubDate>Thu, 12 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Why vector-only RAG fails multi-hop questions — and how Chaos Cypher's GraphRAG fuses graph traversal with semantic search to find the full chain of evidence.]]></description>
            <content:encoded><![CDATA[<p>You upload four research papers to your RAG chatbot. You ask: "How does Dr. Chen's CRISPR research connect to the gene therapy trials at Stanford?" The chatbot thinks for a moment and gives you... a paragraph about CRISPR. Generic, shallow, pulled from whichever single chunk happened to mention the word. The actual answer -- that Chen published a paper on CRISPR delivery mechanisms, which was cited by a Stanford clinical trial for retinal gene therapy, which built on a funding collaboration between both institutions -- exists across three different documents. Your chatbot never even tried to find it.</p>
<p>This is the multi-hop problem, and it's the silent failure mode of every vector-only RAG system. Vector search embeds your question, compares it against document chunks, and returns the closest matches by cosine similarity. It works for single-hop questions: "What is CRISPR?" or "When did the Stanford trial begin?" But the moment an answer requires connecting information across documents -- following a citation chain, tracing a person through multiple sources, linking a cause in one report to an effect in another -- vector search falls apart. It can't follow relationships. It doesn't know that entities in different documents refer to the same thing. It just sees text.</p>
<p>The worst part: it fails silently. No error message, no "I couldn't find a complete answer." You get a confident-sounding response that happens to be shallow or wrong.</p>
<p>Chaos Cypher's GraphRAG search fixes this by fusing knowledge graph traversal with vector search. When you ask a multi-hop question, it walks the graph of entities and relationships extracted from your documents, finds structurally connected information you didn't ask about, retrieves the source passages that prove those connections, and merges everything into a single ranked result set. The answer you get isn't just semantically similar text. It's the actual chain of evidence.</p>
<p><img decoding="async" loading="lazy" alt="Search results showing entities with relevance scores and type badges" src="https://chaoscypher.com/assets/images/search-results-34db5bfb5d00adcfc2e3c7a3c2934ed2.png" width="1280" height="800" class="img_ev3q"></p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="what-happens-when-you-ask-a-multi-hop-question">What Happens When You Ask a Multi-Hop Question<a href="https://chaoscypher.com/blog/graphrag-enhanced-search#what-happens-when-you-ask-a-multi-hop-question" class="hash-link" aria-label="Direct link to What Happens When You Ask a Multi-Hop Question" title="Direct link to What Happens When You Ask a Multi-Hop Question" translate="no">​</a></h2>
<p>Let's walk through a real scenario. You have uploaded three documents into Chaos Cypher: a research paper by Dr. Sarah Chen on CRISPR delivery vectors, a Stanford clinical trial report on retinal gene therapy, and a grant proposal connecting both institutions. You type into the chat: "How does Chen's CRISPR work relate to the Stanford gene therapy trial?"</p>
<p>Here's what happens behind the scenes, in seven steps.</p>
<p><strong>Step 1: Embed the query.</strong> Your question gets converted into a vector embedding -- the same starting point as any RAG system.</p>
<p><strong>Step 2: Match seed entities.</strong> Instead of immediately searching document chunks, GraphRAG first searches the knowledge graph. It finds entities whose embeddings are closest to your query vector. In this case, it matches "Dr. Sarah Chen" (a Person node) and "CRISPR delivery vectors" (a Concept node) as high-confidence seeds -- the anchor points for graph exploration.</p>
<p><strong>Step 3: Personalized PageRank.</strong> This is where it gets interesting. Standard PageRank finds globally important nodes. Personalized PageRank is different: it starts from your seed entities and performs a biased random walk through the graph. At each step, there is an 85% chance of following a relationship to a neighbor, and a 15% chance of teleporting back to a seed. Entities structurally close to your seeds get high scores, even if they were never mentioned in your query.</p>
<p>In our example, the algorithm discovers that "Dr. Sarah Chen" has a "published" relationship to "Lipid Nanoparticle Delivery Study," which has a "cited_by" edge pointing to "Stanford Retinal Gene Therapy Trial Phase II," which in turn has a "funded_by" connection to "NIH CRISPR Therapeutics Grant" -- a grant that also lists Chen as a co-investigator. None of these intermediate entities matched your query by text similarity. The graph surfaced them.</p>
<p><strong>Step 4: Assemble graph context.</strong> The top-scoring entities from PageRank are collected along with their relationships. This produces a structured context: seed entities you asked about, related entities the graph discovered, and the relationship triples connecting them. This context gets passed to the language model alongside the document chunks, giving it the structural "map" it needs to reason about connections.</p>
<p><strong>Step 5: Retrieve provenance chunks.</strong> This is the first of two parallel retrieval paths. For each entity the graph surfaced, GraphRAG looks up the document chunks that entity was originally extracted from. Chen was extracted from page 3 of the research paper. The Stanford trial came from the clinical report abstract. The funding connection came from page 12 of the grant proposal. These "provenance chunks" contain the actual evidence for the graph relationships.</p>
<p><strong>Step 6: Retrieve vector chunks.</strong> The second path runs simultaneously -- standard hybrid search (semantic + keyword) against all document chunks. It catches relevant passages that might not have generated graph entities but still contain useful context.</p>
<p><strong>Step 7: Merge and rank.</strong> The two paths produce two independently ranked lists. GraphRAG merges them using Reciprocal Rank Fusion, which combines rankings without normalizing scores across systems. Chunks appearing in both lists get a combined boost. The result is a single, deduplicated, ranked list of the most relevant passages across all your documents.</p>
<p>Instead of a shallow answer about CRISPR, you get the full chain: Chen's delivery mechanism research led to a cited clinical application at Stanford, connected through shared funding. The chat response includes both the graph context (discovered entities and relationships) and the document passages that prove those connections.</p>
<p><img decoding="async" loading="lazy" alt="Knowledge graph with search highlighting entity paths" src="https://chaoscypher.com/assets/images/graph-search-highlight-6d914ee83ab9d40dcf340fb2f3234d50.png" width="1280" height="800" class="img_ev3q"></p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="under-the-hood-technical-deep-dive">Under the Hood (Technical Deep-Dive)<a href="https://chaoscypher.com/blog/graphrag-enhanced-search#under-the-hood-technical-deep-dive" class="hash-link" aria-label="Direct link to Under the Hood (Technical Deep-Dive)" title="Direct link to Under the Hood (Technical Deep-Dive)" translate="no">​</a></h2>
<p><em>This section is for developers who want to understand the algorithms. Skip ahead to "Try It Yourself" if you just want to use it.</em></p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="personalized-pagerank">Personalized PageRank<a href="https://chaoscypher.com/blog/graphrag-enhanced-search#personalized-pagerank" class="hash-link" aria-label="Direct link to Personalized PageRank" title="Direct link to Personalized PageRank" translate="no">​</a></h3>
<p>Standard PageRank models a "random surfer" following links uniformly across a network. Personalized PageRank changes one thing: instead of teleporting to a random node, the surfer teleports back to seed nodes. This transforms a global importance metric into a query-specific relevance metric.</p>
<p>Chaos Cypher's implementation uses power iteration. Starting from scores concentrated on seed entities, it iteratively updates every node based on contributions from its inbound neighbors. Each neighbor splits its current score evenly across its outgoing edges, so a contribution is the neighbor's score divided by its out-degree. The damping factor (0.85 default) controls the balance: higher values explore further from seeds; lower values keep scores tightly clustered around them.</p>
<p>Convergence is detected when the maximum score change drops below 1e-6, or after 100 iterations. In practice, most graphs converge in 15-30 iterations. The computation runs in-process with no external dependencies -- a graph of 10,000 nodes and 40,000 edges typically completes in under 100ms.</p>
<p>The seed weights come from vector similarity scores in Step 2. If "Dr. Sarah Chen" matched at 0.82 and "CRISPR delivery vectors" at 0.71, those scores become the personalization weights. The random walk isn't just seeded on the right entities -- it's biased toward the ones most relevant to your specific question.</p>
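<p>The description above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the shipped implementation: the function name <code>personalized_pagerank</code>, the edge directions, the edge from "CRISPR delivery vectors" to the study, and the handling of nodes without outgoing edges are all choices made for this example. The damping factor, tolerance, iteration cap, and seed weights are the values quoted in this section.</p>

```python
def personalized_pagerank(edges, seed_weights, damping=0.85, tol=1e-6, max_iter=100):
    """Power iteration for Personalized PageRank on a directed edge list."""
    nodes = set(seed_weights)
    for src, dst in edges:
        nodes.update((src, dst))
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Normalize seed similarities into a teleport distribution.
    total = sum(seed_weights.values())
    teleport = {node: seed_weights.get(node, 0.0) / total for node in nodes}
    scores = dict(teleport)

    for _ in range(max_iter):
        # With probability (1 - damping) the walker jumps back to a seed.
        new_scores = {node: (1.0 - damping) * teleport[node] for node in nodes}
        for node in nodes:
            targets = out_links[node]
            if targets:
                # Split this node's score evenly across its outgoing edges.
                share = damping * scores[node] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # Assumption: a node with no outgoing edges returns its mass to the seeds.
                for other in nodes:
                    new_scores[other] += damping * scores[node] * teleport[other]
        delta = max(abs(new_scores[node] - scores[node]) for node in nodes)
        scores = new_scores
        if tol > delta:
            break
    return scores


# Edges loosely follow the walkthrough; directions are illustrative.
edges = [
    ("Dr. Sarah Chen", "Lipid Nanoparticle Delivery Study"),
    ("CRISPR delivery vectors", "Lipid Nanoparticle Delivery Study"),
    ("Lipid Nanoparticle Delivery Study", "Stanford Retinal Gene Therapy Trial Phase II"),
    ("Stanford Retinal Gene Therapy Trial Phase II", "NIH CRISPR Therapeutics Grant"),
    ("NIH CRISPR Therapeutics Grant", "Dr. Sarah Chen"),
]
seeds = {"Dr. Sarah Chen": 0.82, "CRISPR delivery vectors": 0.71}

scores = personalized_pagerank(edges, seeds)
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{scores[name]:.4f}  {name}")
```

<p>The study, the trial, and the grant all end up with non-zero scores even though none of them is a seed, which is the behavior the walkthrough relies on.</p>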
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="reciprocal-rank-fusion">Reciprocal Rank Fusion<a href="https://chaoscypher.com/blog/graphrag-enhanced-search#reciprocal-rank-fusion" class="hash-link" aria-label="Direct link to Reciprocal Rank Fusion" title="Direct link to Reciprocal Rank Fusion" translate="no">​</a></h3>
<p>Provenance chunks have graph-connectivity scores. Vector chunks have cosine similarity scores. These aren't on the same scale, so you can't just sort by score.</p>
<p>RRF (Cormack, Clarke &amp; Büttcher, 2009) sidesteps this by ignoring scores entirely and using only rank positions. Each chunk's RRF score is the sum of <code>1 / (k + rank)</code> across all lists where it appears. The smoothing constant <code>k</code> (60, matching the original paper) dampens the advantage of being ranked first versus second.</p>
<p>The key property: chunks appearing in both lists get contributions from both, naturally boosting results validated by two independent signals. With <code>k</code> = 60, a chunk ranked 5th in provenance and 8th in vector search scores 1/65 + 1/68 ≈ 0.030, while one ranked 1st in vector search but absent from provenance scores only 1/61 ≈ 0.016, so the doubly confirmed chunk wins. Evidence confirmed by graph structure is worth more than text similarity alone.</p>
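<p>For readers who prefer code to formulas, here is a small Python sketch of Reciprocal Rank Fusion as described above. The function name <code>reciprocal_rank_fusion</code> and the chunk identifiers are made up for the example. The formula and <code>k</code> = 60 come from the text.</p>

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of chunk ids using only rank positions."""
    scores = {}
    for ranking in ranked_lists:
        for rank, chunk_id in enumerate(ranking, start=1):
            # Each appearance contributes 1 / (k + rank) to the chunk's score.
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties keep first-seen order.
    return sorted(scores, key=lambda chunk_id: scores[chunk_id], reverse=True)


# "shared" is 5th in the provenance list and 8th in the vector list.
provenance = ["p1", "p2", "p3", "p4", "shared"]
vector = ["v_top", "v2", "v3", "v4", "v5", "v6", "v7", "shared"]

fused = reciprocal_rank_fusion([provenance, vector])
print(fused[:3])
```

<p>The chunk that appears in both lists comes out ahead of either list's top result, because two modest contributions add up to more than a single first-place contribution.</p>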
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="graceful-degradation">Graceful Degradation<a href="https://chaoscypher.com/blog/graphrag-enhanced-search#graceful-degradation" class="hash-link" aria-label="Direct link to Graceful Degradation" title="Direct link to Graceful Degradation" translate="no">​</a></h3>
<p>Not every database has a knowledge graph. Not every query matches graph entities. GraphRAG picks its operating mode automatically:</p>
<ul>
<li class=""><strong><code>full_graphrag</code></strong> -- Seeds found, PPR succeeded. Graph context + provenance chunks + vector chunks + RRF fusion.</li>
<li class=""><strong><code>vector_only</code></strong> -- Embeddings work but no graph seeds found. Standard hybrid search, no graph context.</li>
<li class=""><strong><code>keyword_only</code></strong> -- Embeddings unavailable. Pure SQLite FTS keyword search.</li>
</ul>
<p>A missing graph or unavailable embeddings never turns into a failed search -- the system falls back to the best mode still available. The retrieval stats in each response tell you exactly what happened: mode used, seeds found, entities explored, provenance versus vector chunk counts.</p>
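<p>The fallback order can be written down as a short decision function. The sketch below is an assumption about how the pieces fit together, not the actual code: the function name <code>choose_mode</code> and its arguments are invented for illustration. The three mode names come from the list above, and the node limit reflects the <code>max_graph_nodes</code> safety setting in <code>settings.yaml</code>.</p>

```python
def choose_mode(embeddings_available, seed_count, graph_nodes,
                max_graph_nodes=50_000, ppr_succeeded=True):
    """Pick the richest retrieval mode that the current state supports."""
    if not embeddings_available:
        # No embeddings means no vector search and no seed matching.
        return "keyword_only"
    if seed_count == 0:
        # Nothing in the graph matched the query closely enough.
        return "vector_only"
    if graph_nodes > max_graph_nodes or not ppr_succeeded:
        # Graph too large for PageRank, or the PageRank step did not complete.
        return "vector_only"
    return "full_graphrag"


print(choose_mode(True, seed_count=2, graph_nodes=1_200))
print(choose_mode(True, seed_count=0, graph_nodes=1_200))
print(choose_mode(False, seed_count=0, graph_nodes=0))
```

<p>Ordering the checks from most to least severe keeps the function total: every combination of inputs maps to one of the three modes.</p>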
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="tunable-parameters">Tunable Parameters<a href="https://chaoscypher.com/blog/graphrag-enhanced-search#tunable-parameters" class="hash-link" aria-label="Direct link to Tunable Parameters" title="Direct link to Tunable Parameters" translate="no">​</a></h3>
<p>Six parameters in <code>settings.yaml</code> control the GraphRAG pipeline. The defaults work well for most databases, but here they are if you want to tune:</p>
<table><thead><tr><th>Parameter</th><th>Default</th><th>What It Controls</th></tr></thead><tbody><tr><td><code>seed_similarity_threshold</code></td><td>0.3</td><td>Minimum cosine similarity for a graph entity to qualify as a PPR seed. Lower values cast a wider net but may introduce noise.</td></tr><tr><td><code>ppr_top_k</code></td><td>20</td><td>Number of top-scoring entities from PageRank to include in graph context. Higher values give the LLM more structural context at the cost of token budget.</td></tr><tr><td><code>ppr_damping</code></td><td>0.85</td><td>PageRank damping factor. Higher means more exploration away from seeds. Lower keeps results closer to directly matched entities.</td></tr><tr><td><code>max_triples</code></td><td>200</td><td>Maximum relationship triples included in the graph context summary. Capped to avoid flooding the LLM context window.</td></tr><tr><td><code>vector_overfetch_multiplier</code></td><td>3</td><td>When searching for seed entities, fetch 3x the seed limit from the vector index to account for non-entity results (chunks) that need filtering.</td></tr><tr><td><code>max_graph_nodes</code></td><td>50,000</td><td>Safety limit. If your graph exceeds this, PPR is skipped (too expensive) and the system falls back to vector-only mode.</td></tr></tbody></table>
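<p>To show how <code>seed_similarity_threshold</code> and <code>vector_overfetch_multiplier</code> interact, here is a rough Python sketch of seed selection. Everything beyond those two settings is assumed for illustration: the <code>select_seeds</code> function, the <code>seed_limit</code> argument, and the flat list standing in for a vector index that mixes entities and chunks.</p>

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_seeds(query_vec, index, seed_limit=5, overfetch=3, threshold=0.3):
    """Return up to seed_limit (entity_id, score) pairs above the threshold."""
    scored = sorted(
        ((cosine(query_vec, vec), item_id, kind) for item_id, kind, vec in index),
        reverse=True,
    )
    # Over-fetch because some of the nearest neighbors are chunks, not entities.
    candidates = scored[: seed_limit * overfetch]
    seeds = [
        (item_id, score)
        for score, item_id, kind in candidates
        if kind == "entity" and score >= threshold
    ]
    return seeds[:seed_limit]


# Toy index: (id, kind, embedding). Real embeddings have far more dimensions.
index = [
    ("chen", "entity", [1.0, 0.0]),
    ("chunk-1", "chunk", [0.9, 0.1]),
    ("crispr", "entity", [0.7, 0.7]),
    ("unrelated", "entity", [0.0, 1.0]),
]

seeds = select_seeds([1.0, 0.0], index)
print(seeds)
```

<p>The chunk is filtered out despite its high similarity, and the unrelated entity falls below the threshold, leaving two seeds whose scores would then serve as personalization weights.</p>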
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="try-it-yourself">Try It Yourself<a href="https://chaoscypher.com/blog/graphrag-enhanced-search#try-it-yourself" class="hash-link" aria-label="Direct link to Try It Yourself" title="Direct link to Try It Yourself" translate="no">​</a></h2>
<p>Here's the good news: you don't need to configure anything. GraphRAG is the default search mode behind every chat conversation in Chaos Cypher. When you type a question, the chat system automatically calls <code>graphrag_search</code> as its first tool. If your database has extracted entities and embeddings, you get the full pipeline. If not, it degrades gracefully to vector or keyword search.</p>
<p>The simplest way to see it in action:</p>
<ol>
<li class="">
<p><strong>Upload 3-4 related documents.</strong> Pick sources that share entities -- research papers from the same field, chapters from the same book, reports about the same project. The key is overlap: the documents should reference some of the same people, organizations, concepts, or events.</p>
</li>
<li class="">
<p><strong>Wait for indexing, then run extraction.</strong> Chaos Cypher chunks the documents and generates embeddings automatically. Entity extraction is a separate, optional step that you start yourself, and it is what creates the graph nodes and edges that GraphRAG traverses. Without it, you still get vector-only search, which is fine -- but you miss the multi-hop connections.</p>
</li>
<li class="">
<p><strong>Ask a question that spans documents.</strong> Don't ask something that a single document can answer. Ask about connections: "How does X relate to Y?" or "What is the link between the findings in paper A and the methodology in paper B?" This is where GraphRAG earns its keep.</p>
</li>
<li class="">
<p><strong>Check the retrieval stats.</strong> In the chat response metadata, you'll see the retrieval mode (<code>full_graphrag</code>, <code>vector_only</code>, or <code>keyword_only</code>), the number of seed entities found, how many entities PageRank explored, and the breakdown of provenance versus vector chunks. This tells you exactly what the pipeline did for your query.</p>
</li>
</ol>
<p>GraphRAG is also available as an MCP tool called <code>graphrag_search</code>, meaning any AI assistant that supports MCP can use it directly against your Chaos Cypher instance. See our <a class="" href="https://chaoscypher.com/blog/mcp-server-launch">MCP launch post</a> for setup instructions with Claude Desktop, Cursor, and others. The tool accepts a query, an optional chunk limit, and optional source ID filters for scoping searches to specific documents.</p>
<p>If you want to fine-tune the pipeline for your specific use case, add a <code>graphrag</code> section to your <code>settings.yaml</code>:</p>
<div class="language-yaml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-yaml codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#c8c8e0"><span class="token key atrule">graphrag</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token key atrule">seed_similarity_threshold</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">0.3</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token key atrule">ppr_top_k</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">20</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token key atrule">ppr_damping</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">0.85</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token key atrule">max_triples</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">200</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token key 
atrule">vector_overfetch_multiplier</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">3</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token key atrule">max_graph_nodes</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">50000</span><br></div></code></pre></div></div>
<p>Most users will never need to touch these. The defaults were chosen based on the GraphRAG literature and testing across databases of varying sizes -- from small personal collections (hundreds of entities) to larger research corpora (tens of thousands of entities).</p>
<p><img decoding="async" loading="lazy" alt="Chat conversation with AI response and source citations" src="https://chaoscypher.com/assets/images/chat-conversation-2bd142bdd922a54b30ef33f85629de68.png" width="1280" height="800" class="img_ev3q"></p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="whats-next">What's Next<a href="https://chaoscypher.com/blog/graphrag-enhanced-search#whats-next" class="hash-link" aria-label="Direct link to What's Next" title="Direct link to What's Next" translate="no">​</a></h2>
<p>GraphRAG in Chaos Cypher today handles local queries well -- questions where you have a specific starting point and want to follow connections outward. But there's a class of questions it doesn't yet handle optimally: corpus-wide questions like "What are the main themes across all my documents?" or "Summarize everything related to sustainability."</p>
<p>These require what the research literature calls community summaries -- pre-computed summaries of entity clusters in the graph that can answer high-level questions without traversing the entire structure at query time. That's on the roadmap.</p>
<p>If you're working with a use case where multi-hop retrieval matters -- legal discovery, academic research, intelligence analysis, medical literature review -- we'd love to hear about your experience. What kinds of multi-hop questions does your work require? Where does the current pipeline fall short? The best way to reach us is through the project's GitHub discussions.</p>
<p>For a deeper look at the architecture, see the <a class="" href="https://chaoscypher.com/docs/user-guide/search">Search documentation</a> and the <a class="" href="https://chaoscypher.com/docs/getting-started/overview">Architecture overview</a>.</p>]]></content:encoded>
            <category>feature-launch</category>
            <category>ai</category>
        </item>
        <item>
            <title><![CDATA[Build a Private AI Knowledge Graph That Never Leaves Your Machine]]></title>
            <link>https://chaoscypher.com/blog/local-ai-knowledge-graph</link>
            <guid>https://chaoscypher.com/blog/local-ai-knowledge-graph</guid>
            <pubDate>Thu, 12 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Run Chaos Cypher with Ollama for a fully local AI knowledge graph — no API keys, no data leaving your network, same pipeline as cloud providers.]]></description>
            <content:encoded><![CDATA[<p>Every week, another AI tool asks you to upload your most sensitive documents to someone else's servers. Your contracts, medical records, internal research, personal journals -- all piped through APIs you don't control, stored in logs you can't audit, governed by terms of service that change without notice.</p>
<p>For a lot of use cases, that's fine. But there's a whole class of knowledge that simply cannot leave your network. Healthcare organizations bound by HIPAA. Law firms handling privileged communications. Financial institutions with regulatory obligations around client data. Companies whose competitive advantage lives in proprietary research. Or maybe you just have a journal and you'd rather not feed your inner monologue to a data center in Virginia.</p>
<p>The usual answer is "just don't use AI tools." That's not really an answer anymore. The productivity gap between AI-assisted knowledge work and manual knowledge work is too wide to ignore. The real question is: can you get the benefits of AI-powered knowledge graphs without the privacy tradeoffs?</p>
<p>Yes. Chaos Cypher paired with Ollama runs a complete AI knowledge graph pipeline -- document ingestion, entity extraction, relationship mapping, semantic search, and conversational chat -- entirely on your local machine. No API keys. No usage limits. No monthly bills. No data leaving your network. You install it, you run it, you own it.</p>
<p>This isn't a compromise or a toy demo. It's the same extraction pipeline, the same graph visualization, the same chat interface that works with cloud providers. You're just swapping the LLM backend from a remote API to a local one.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="from-zero-to-local-knowledge-graph">From Zero to Local Knowledge Graph<a href="https://chaoscypher.com/blog/local-ai-knowledge-graph#from-zero-to-local-knowledge-graph" class="hash-link" aria-label="Direct link to From Zero to Local Knowledge Graph" title="Direct link to From Zero to Local Knowledge Graph" translate="no">​</a></h2>
<p>Here's the full workflow, start to finish. Fifteen minutes if you're following along, five if you've done this before.</p>
<p><strong>Step 1: Install Ollama and pull a model.</strong></p>
<p>Head to <a href="https://ollama.com/" target="_blank" rel="noopener noreferrer" class="">ollama.com</a> and install it for your platform. Then pull a model:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#c8c8e0"><span class="token plain">ollama pull qwen3:30b</span><br></div></code></pre></div></div>
<p>That downloads the model weights once. After that, Ollama runs as a local API server at <code>localhost:11434</code>, exposing its own REST API along with OpenAI-compatible endpoints.</p>
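<p>If you want to confirm the local server responds before starting the rest of the stack, a short Python script against Ollama's native <code>/api/generate</code> endpoint is enough. This is a sanity-check sketch, not part of Chaos Cypher: the helper names <code>build_generate_request</code> and <code>generate</code> are made up here, and the script assumes the default port and the model pulled above.</p>

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"


def build_generate_request(prompt, model="qwen3:30b"):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="qwen3:30b"):
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    # Requires a running Ollama server with the model already pulled.
    print(generate("Reply with the single word: ready"))
```

<p>The request goes to <code>localhost</code> only, so a successful reply also confirms that nothing in this step depends on an external service.</p>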
<p><strong>Step 2: Start the Chaos Cypher stack.</strong></p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#c8c8e0"><span class="token function" style="color:#00fff0">make</span><span class="token plain"> docker-dev</span><br></div></code></pre></div></div>
<p>This brings up four containers: the Cortex API server, a Neuron background worker, the web Interface, and Valkey for job queuing. Everything talks to Ollama on your host machine through Docker's <code>host.docker.internal</code> bridge. No external network calls.</p>
<p><strong>Step 3: Upload a document.</strong></p>
<p>Open <code>http://localhost:3000</code>, create a database (or use the default), and drag a PDF, DOCX, or text file into the Sources page. Chaos Cypher immediately begins indexing -- chunking the document, generating embeddings, and building a search index. This takes about 30 seconds for a 100-page PDF and requires no GPU at all (more on that below).</p>
<p><strong>Step 4: Extract entities and relationships.</strong></p>
<p>Once indexing completes, kick off entity extraction. This is where the LLM does its work -- reading through each chunk, identifying entities (people, organizations, concepts, events), discovering relationships between them, and building a structured knowledge graph. Chaos Cypher automatically detects the type of document and applies <a class="" href="https://chaoscypher.com/blog/domain-extraction-guide">domain-specific extraction rules</a> for higher quality results. For a 100-page document with a 30B model, expect roughly 5-10 minutes.</p>
<p><img decoding="async" loading="lazy" alt="Sources list showing document processing status" src="https://chaoscypher.com/assets/images/sources-list-6c3bb05048206eec583c2eb2a0733141.png" width="1280" height="800" class="img_ev3q"></p>
<p><strong>Step 5: Chat with your knowledge graph.</strong></p>
<p>Once extraction finishes and the results are committed to your graph, open the Chat page and start asking questions. The chat system uses RAG (retrieval-augmented generation) to search your indexed documents and graph, then feeds the relevant context to your local LLM for a grounded answer. Everything stays on your machine -- the search, the retrieval, the generation.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="pick-your-preset">Pick Your Preset<a href="https://chaoscypher.com/blog/local-ai-knowledge-graph#pick-your-preset" class="hash-link" aria-label="Direct link to Pick Your Preset" title="Direct link to Pick Your Preset" translate="no">​</a></h3>
<p>Not everyone has the same GPU. Chaos Cypher ships with VRAM presets that auto-configure the right model, context window, and batch size for your hardware. Select a preset in Settings and it handles the rest.</p>
<table><thead><tr><th>VRAM</th><th>Chat Model</th><th>Extraction Model</th><th>Context</th><th>GPU Examples</th></tr></thead><tbody><tr><td>16 GB</td><td>Phi4 14B</td><td>Phi4 14B</td><td>16K</td><td>RTX 4080, RTX 5080</td></tr><tr><td>20 GB</td><td>Phi4 14B</td><td>Phi4 14B</td><td>24K</td><td>RTX 5080 Super</td></tr><tr><td>24 GB</td><td>Qwen3 30B</td><td>Qwen3 30B Instruct</td><td>16K</td><td>RTX 4090, RTX 3090</td></tr><tr><td>32 GB</td><td>Qwen3 30B</td><td>Qwen3 30B Instruct</td><td>32K</td><td>RTX 5090</td></tr><tr><td>48 GB</td><td>Qwen3 30B</td><td>Qwen3 30B Instruct</td><td>48K</td><td>A6000, 2x 4090</td></tr><tr><td>96 GB</td><td>Qwen 2.5 72B</td><td>Qwen 2.5 72B Instruct</td><td>48K</td><td>H100</td></tr><tr><td>128 GB</td><td>Qwen 2.5 72B</td><td>Qwen 2.5 72B Instruct</td><td>64K</td><td>Multi-H100</td></tr></tbody></table>
<p>The sweet spot for most people is 24 GB. An RTX 4090 running Qwen3 30B gives you strong chat quality and solid extraction results. If you're on 16 GB, you'll still get a good experience for chat and search -- extraction quality will be noticeably lower on complex documents, but perfectly usable for straightforward material.</p>
<p><img decoding="async" loading="lazy" alt="LLM provider settings with Ollama configuration and VRAM preset" src="https://chaoscypher.com/assets/images/settings-llm-provider-47d95872c8ce8f83ee8ab7b204416df9.png" width="1280" height="800" class="img_ev3q"></p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="under-the-hood">Under the Hood<a href="https://chaoscypher.com/blog/local-ai-knowledge-graph#under-the-hood" class="hash-link" aria-label="Direct link to Under the Hood" title="Direct link to Under the Hood" translate="no">​</a></h2>
<p>A few things are worth knowing about how the local pipeline actually works.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="embeddings-are-always-local">Embeddings Are Always Local<a href="https://chaoscypher.com/blog/local-ai-knowledge-graph#embeddings-are-always-local" class="hash-link" aria-label="Direct link to Embeddings Are Always Local" title="Direct link to Embeddings Are Always Local" translate="no">​</a></h3>
<p>Here's something that surprises people: by default, the embedding model that powers semantic search runs on CPU. It has nothing to do with Ollama or your GPU. Chaos Cypher defaults to Qwen3-Embedding-0.6B, a compact model that downloads once and runs locally via sentence-transformers. Any HuggingFace sentence-transformers model can be used instead, and API-based embedding providers (OpenAI, Ollama, Gemini) are also supported if you opt into them.</p>
<p>This means semantic search works even if Ollama is offline. It means you can index thousands of documents on a machine with no GPU at all. The embeddings are generated in the Neuron worker during indexing and stored in your local SQLite database (via sqlite-vec). Search queries generate an embedding on the fly, compare it against the index, and return results -- all on CPU, all local, typically in under a second.</p>
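<p>To make the search step concrete, here is a toy sketch of embedding lookup by cosine similarity. The two-dimensional vectors and the <code>top_k</code> helper are made up for illustration; in Chaos Cypher the vectors come from the embedding model and the comparison happens inside sqlite-vec, so treat this as the idea rather than the implementation.</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv">import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, index, k=2):
    """index maps chunk id to vector; returns (chunk id, score) pairs, best first."""
    scored = [(cid, cosine(query_vec, vec)) for cid, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy index: three "chunks" with hand-written 2-D embeddings.
index = {"chunk-a": [1.0, 0.0], "chunk-b": [0.0, 1.0], "chunk-c": [1.0, 1.0]}
results = top_k([1.0, 0.0], index)</code></pre></div></div>
<p>Here <code>results</code> ranks <code>chunk-a</code> first and <code>chunk-c</code> second, because their vectors point closest to the query.</p>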
<p>Re-ranking also runs locally. Chaos Cypher uses a cross-encoder model (Alibaba-NLP/gte-reranker-modernbert-base, 149M parameters, ~600 MB) via sentence-transformers to re-rank search results by relevance before passing them to the LLM. No API calls involved. The ModernBERT-based model scores ~56.2 NDCG@10 on the BEIR benchmark -- significantly more accurate than smaller models on diverse, out-of-domain queries. Any HuggingFace cross-encoder can be swapped in via settings.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="multi-instance-load-balancing">Multi-Instance Load Balancing<a href="https://chaoscypher.com/blog/local-ai-knowledge-graph#multi-instance-load-balancing" class="hash-link" aria-label="Direct link to Multi-Instance Load Balancing" title="Direct link to Multi-Instance Load Balancing" translate="no">​</a></h3>
<p>Have multiple machines with GPUs? Or multiple GPUs in one workstation? You can point Chaos Cypher at all of them. Configure multiple Ollama instances in your settings, and the load balancer distributes requests across them with three strategies:</p>
<ul>
<li class=""><strong>Round-robin</strong> -- simple alternation, good for identical hardware</li>
<li class=""><strong>Least-loaded</strong> -- sends requests to whichever instance has the fewest active jobs</li>
<li class=""><strong>Random</strong> -- exactly what it sounds like</li>
</ul>
<p>Each instance gets independent health checks. If one goes down, the load balancer automatically fails over to the healthy instances. When it comes back, it rejoins the pool. The configuration is hot-reloadable -- add or remove instances from the Settings page without restarting anything. In-flight requests drain gracefully before an instance is removed.</p>
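<p>The three strategies and the failover behavior can be sketched in a few lines. This is an illustration only: the <code>Instance</code> and <code>Balancer</code> classes, the strategy strings, and the example URLs are hypothetical and are not Chaos Cypher's actual code or configuration.</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv">import itertools
import random
from dataclasses import dataclass

@dataclass
class Instance:
    url: str
    active_jobs: int = 0
    healthy: bool = True  # would be updated by a periodic health check

class Balancer:
    def __init__(self, instances, strategy="round_robin"):
        self.instances = list(instances)
        self.strategy = strategy
        self._counter = itertools.count()

    def pick(self):
        # Failover: only healthy instances are candidates.
        pool = [i for i in self.instances if i.healthy]
        if not pool:
            raise RuntimeError("no healthy Ollama instances")
        if self.strategy == "least_loaded":
            return min(pool, key=lambda i: i.active_jobs)
        if self.strategy == "random":
            return random.choice(pool)
        # Round-robin: step through the healthy pool in order.
        return pool[next(self._counter) % len(pool)]</code></pre></div></div>
<p>Filtering on <code>healthy</code> before choosing is what makes failover and rejoining fall out naturally: an instance that recovers is simply a candidate again on the next pick.</p>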
<p>This is particularly useful for extraction workloads. A 500-page document produces hundreds of chunk groups to process. Spreading that across two or three GPUs cuts extraction time roughly in proportion: about half the time with two instances, about a third with three.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="thinking-mode">Thinking Mode<a href="https://chaoscypher.com/blog/local-ai-knowledge-graph#thinking-mode" class="hash-link" aria-label="Direct link to Thinking Mode" title="Direct link to Thinking Mode" translate="no">​</a></h3>
<p>Qwen3 models support an extended reasoning mode using <code>&lt;think&gt;</code> tags -- the model works through its reasoning step by step before producing a final answer. Chaos Cypher detects and handles this automatically. When thinking is enabled for chat, the model's internal reasoning is extracted and available separately from the final response. For models that don't support thinking tags, everything works normally -- no configuration needed, graceful fallback.</p>
<p>Thinking is currently best suited for chat interactions where you want more careful, reasoned responses. For extraction tasks, the overhead of reasoning tokens tends to slow things down without a proportional quality improvement, so Chaos Cypher disables it for extraction by default. You can toggle this per-operation type in settings.</p>
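<p>Separating the reasoning from the answer comes down to splitting on the <code>&lt;think&gt;</code> tags. A minimal sketch, assuming the model emits well-formed tag pairs; the function name is hypothetical and this is not Chaos Cypher's parser:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv">import re

# Non-greedy match so several think blocks in one reply are handled separately.
THINK_RE = re.compile(r"&lt;think&gt;(.*?)&lt;/think&gt;", re.DOTALL)

def split_thinking(raw):
    """Return (reasoning, answer); reasoning is '' when no think tags are present."""
    reasoning = "\n".join(part.strip() for part in THINK_RE.findall(raw))
    answer = THINK_RE.sub("", raw).strip()
    return reasoning, answer</code></pre></div></div>
<p>Output from a model without thinking support passes through unchanged as the answer, which is the graceful fallback described above.</p>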
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="performance-reality-check">Performance Reality Check<a href="https://chaoscypher.com/blog/local-ai-knowledge-graph#performance-reality-check" class="hash-link" aria-label="Direct link to Performance Reality Check" title="Direct link to Performance Reality Check" translate="no">​</a></h3>
<p>Let's be honest about the tradeoffs, because nobody benefits from hype.</p>
<p><strong>Chat is great locally.</strong> Interactive question-answering with RAG retrieval works well on 24 GB+ hardware. The model has context from your documents, it generates coherent answers, and latency is acceptable for interactive use. Streaming means you see tokens as they arrive -- the experience feels responsive even when total generation takes a few seconds.</p>
<p><strong>Simple extraction works well.</strong> Documents with clear entity boundaries -- people's names, organization names, dates, locations -- extract reliably on local models. Legal contracts with named parties and defined obligations, research papers with cited authors and institutions, meeting notes with action items and owners.</p>
<p><strong>Complex extraction is where you notice the gap.</strong> Dense academic papers with nuanced conceptual relationships, documents where entities are implied rather than stated, multi-hop reasoning about how concepts relate to each other -- this is where cloud models with 100B+ parameters still have a meaningful advantage. A Qwen3 30B model will get you 70-80% of what Claude or GPT-4.1 would produce on hard extraction tasks. For many use cases, that's more than enough. For others, you'll want to use a cloud provider for the extraction pass and keep everything else local.</p>
<p>The good news: Chaos Cypher lets you mix and match. Use Ollama for chat and search (where privacy matters most, since those are interactive queries about your data), and use a cloud provider for the one-time extraction pass if you need maximum quality. Or keep everything local and accept the quality tradeoff. Your call.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="four-providers-one-interface">Four Providers, One Interface<a href="https://chaoscypher.com/blog/local-ai-knowledge-graph#four-providers-one-interface" class="hash-link" aria-label="Direct link to Four Providers, One Interface" title="Direct link to Four Providers, One Interface" translate="no">​</a></h3>
<p>Chaos Cypher supports four LLM providers through a unified interface:</p>
<ul>
<li class=""><strong>Ollama</strong> -- local models, no API key, no cost</li>
<li class=""><strong>OpenAI</strong> -- GPT-4.1, high-quality extraction</li>
<li class=""><strong>Anthropic</strong> -- Claude Sonnet 4.5, strong reasoning</li>
<li class=""><strong>Gemini</strong> -- Gemini 2.5 Pro, massive context window</li>
</ul>
<p>Switching between them is a single config change. The same entity extraction pipeline, the same chat system, the same search infrastructure. You can start with Ollama to prove the workflow works, then switch to a cloud provider for production extraction, or vice versa. You can even use different providers for different operations -- Ollama for chat, OpenAI for extraction.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="try-it-yourself">Try It Yourself<a href="https://chaoscypher.com/blog/local-ai-knowledge-graph#try-it-yourself" class="hash-link" aria-label="Direct link to Try It Yourself" title="Direct link to Try It Yourself" translate="no">​</a></h2>
<p>Minimal configuration in <code>data/settings.yaml</code>:</p>
<div class="language-yaml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-yaml codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#c8c8e0"><span class="token key atrule">LLM</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token key atrule">chat_provider</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"ollama"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token key atrule">ollama_chat_model</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"qwen3:30b-instruct"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token key atrule">ollama_num_ctx</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">32768</span><br></div></code></pre></div></div>
<p>The default Ollama instance points at <code>http://host.docker.internal:11434</code>,
which Just Works™ for the all-in-one container talking to a host-side
Ollama. To override the URL or add multi-GPU instances, use
<code>ollama_instances</code>.</p>
<p>Or skip the YAML entirely -- open the Settings page in the UI, select Ollama as your provider, pick a VRAM preset that matches your GPU, and you're done. The preset fills in the model name, context window, batch size, and extraction model automatically.</p>
<p>Then start everything:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#c8c8e0"><span class="token function" style="color:#00fff0">make</span><span class="token plain"> docker-dev</span><br></div></code></pre></div></div>
<p>Upload a document, wait for indexing (30 seconds) and extraction (a few minutes), and you have a working knowledge graph built entirely on your hardware.</p>
<p><img decoding="async" loading="lazy" alt="Knowledge graph visualization showing extracted entities and relationships" src="https://chaoscypher.com/assets/images/graph-visualization-4b8c7b064f3d310b7a0e58e3f5de5ea8.png" width="1280" height="800" class="img_ev3q"></p>
<p>A few tips for getting the best results:</p>
<ul>
<li class=""><strong>Pull models before starting Chaos Cypher.</strong> Run <code>ollama pull qwen3:30b</code> (or whichever model your preset uses) before your first extraction. The Neuron worker will wait for Ollama, but pre-pulling avoids the initial download delay.</li>
<li class=""><strong>Monitor VRAM usage.</strong> Run <code>nvidia-smi</code> to see how much VRAM your model is using. If you're near the limit, drop to a smaller context window or a smaller model. OOM kills during extraction are recoverable (the job retries), but they're slow.</li>
<li class=""><strong>Start with shorter documents.</strong> Your first upload should be a 10-20 page document so you can see the full pipeline complete in a couple of minutes. Scale up once you're comfortable with the output quality.</li>
<li class=""><strong>Experiment with extraction models.</strong> The presets pair specific extraction models with chat models. The extraction model uses an instruct-tuned variant optimized for structured output. If extraction quality isn't where you want it, try the next VRAM tier up -- the jump from 14B to 30B parameters makes a significant difference in extraction accuracy.</li>
</ul>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="whats-next">What's Next<a href="https://chaoscypher.com/blog/local-ai-knowledge-graph#whats-next" class="hash-link" aria-label="Direct link to What's Next" title="Direct link to What's Next" translate="no">​</a></h2>
<p>Running everything locally is the starting point, not the ceiling.</p>
<p>If you outgrow a single GPU, the multi-instance setup lets you spread load across multiple machines on your network -- a small GPU cluster for your team, still fully private, still no cloud dependency. Configure two or three Ollama instances on different machines, point Chaos Cypher at all of them, and extraction workloads parallelize automatically.</p>
<p>When you do need cloud-tier quality for specific tasks, the cloud providers are there. Chaos Cypher doesn't lock you into local-only or cloud-only. You choose per-operation, per-database, whenever you want. The architecture is the same either way -- the only thing that changes is where the LLM inference happens.</p>
<p>The privacy argument isn't really about paranoia. It's about control. Your knowledge graph is a map of everything you know -- your research, your relationships, your institutional memory. Keeping that map on your own hardware isn't a limitation. It's a feature.</p>]]></content:encoded>
            <category>workflows</category>
            <category>privacy</category>
        </item>
        <item>
            <title><![CDATA[Give Any AI Assistant Direct Access to Your Knowledge Graph with MCP]]></title>
            <link>https://chaoscypher.com/blog/mcp-server-launch</link>
            <guid>https://chaoscypher.com/blog/mcp-server-launch</guid>
            <pubDate>Thu, 12 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Chaos Cypher's MCP server lets Claude Desktop, Cursor, and other AI tools query and build your knowledge graph directly — no copy-paste required.]]></description>
            <content:encoded><![CDATA[<p>Your knowledge graph is stuck in a browser tab. You built something valuable -- a map of entities, relationships, and source documents that represents real understanding of a domain. But the moment you switch to Claude to write a report, or open Cursor to write code, or ask ChatGPT to help with analysis, that knowledge graph might as well not exist. You're back to copying text, pasting context, and manually cross-referencing. Two tools that should be working together are stuck in separate worlds.</p>
<p>Chaos Cypher now speaks MCP, which means any AI assistant that supports the protocol -- Claude Desktop, Claude Code, Cursor, Windsurf, and a growing list of others -- can directly query, search, traverse, and even write to your knowledge graph. No copy-paste. No context switching. Just ask.</p>
<p>This post walks through what that actually looks like, what's under the hood, and how to set it up in about two minutes.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="what-is-mcp-and-why-should-you-care">What Is MCP, and Why Should You Care?<a href="https://chaoscypher.com/blog/mcp-server-launch#what-is-mcp-and-why-should-you-care" class="hash-link" aria-label="Direct link to What Is MCP, and Why Should You Care?" title="Direct link to What Is MCP, and Why Should You Care?" translate="no">​</a></h2>
<p>MCP stands for Model Context Protocol. Anthropic released it as an open standard, and the simplest analogy is USB-C for AI tools. Before USB-C, every device had its own charger, its own cable, its own connector; USB-C replaced them with one. MCP does the same thing for AI integrations: it defines one protocol that any AI host can use to talk to any tool server.</p>
<p>Instead of building a custom plugin for Claude, another for ChatGPT, another for Cursor, and another for every new AI tool that launches next month, you build one MCP server. Every compatible AI tool can use it immediately.</p>
<p>The adoption has been fast. Claude Desktop, Claude Code, Cursor, Windsurf, Cline, and Continue all support MCP today. The protocol handles tool discovery (the AI asks "what can you do?"), tool invocation (the AI calls a function with parameters), and result streaming. From the AI's perspective, your knowledge graph becomes just another set of capabilities it can use to answer questions.</p>
<p>From your perspective, it means you stop being the middleman between your data and your AI.</p>
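<p>Under the protocol, discovery and invocation are plain JSON-RPC 2.0 messages. The sketch below builds the two requests an MCP client sends, using the <code>tools/list</code> and <code>tools/call</code> methods from the MCP specification. The <code>query</code> argument is a guess for illustration; the real argument schema is whatever the server reports during discovery.</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv">import json

def list_tools_request(request_id):
    """JSON-RPC 2.0 message an MCP client sends to discover tools."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}

def call_tool_request(request_id, name, arguments):
    """JSON-RPC 2.0 message an MCP client sends to invoke one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

# "query" is a hypothetical argument name; check the schema from tools/list.
message = call_tool_request(2, "graphrag_search", {"query": "CRISPR"})
print(json.dumps(message, indent=2))</code></pre></div></div>
<p>In practice your AI tool's MCP client builds these messages for you. Seeing them spelled out mostly helps when debugging a connection.</p>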
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="what-this-actually-looks-like">What This Actually Looks Like<a href="https://chaoscypher.com/blog/mcp-server-launch#what-this-actually-looks-like" class="hash-link" aria-label="Direct link to What This Actually Looks Like" title="Direct link to What This Actually Looks Like" translate="no">​</a></h2>
<p>The best way to understand MCP is to see the before and after.</p>
<p><strong>Before MCP:</strong> You have a knowledge graph with 200 entities extracted from research papers on gene therapy. You're writing a literature review in Claude. To reference your graph, you open Chaos Cypher in another tab, run a search, copy the results, paste them into Claude, ask your question, realize you need more context, go back to the graph, find related entities, copy those too, paste again. Repeat until frustrated.</p>
<p><strong>After MCP:</strong> You tell Claude: "Search my knowledge graph for all entities related to CRISPR and find the shortest path to gene therapy applications." Claude calls <code>graphrag_search</code> to find relevant entities and document passages, then calls <code>find_shortest_path</code> to trace the relationship chain. You get a grounded answer with specific entities and relationships from your own research, in one turn.</p>
<p>Here are three scenarios that show the range of what's possible.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="scenario-1-research----connecting-the-dots">Scenario 1: Research -- Connecting the Dots<a href="https://chaoscypher.com/blog/mcp-server-launch#scenario-1-research----connecting-the-dots" class="hash-link" aria-label="Direct link to Scenario 1: Research -- Connecting the Dots" title="Direct link to Scenario 1: Research -- Connecting the Dots" translate="no">​</a></h3>
<p>You've been building a knowledge graph from papers on quantum computing and machine learning. You're deep in a writing session in Claude Desktop and want to understand where these two fields intersect in your collected research.</p>
<p>You ask: <em>"What are the connections between quantum computing and machine learning in my research? Show me the key entities and how they're related."</em></p>
<p>Claude calls <code>search_nodes</code> to find nodes matching both topics, then <code>get_node_context</code> to pull the immediate neighborhood of the most central ones, including the edges that connect them and the source document chunks that support each relationship. You get back a structured map of how your research connects these fields -- not a generic internet answer, but one grounded in the specific papers you've indexed.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="scenario-2-coding----your-projects-knowledge-base-in-your-editor">Scenario 2: Coding -- Your Project's Knowledge Base in Your Editor<a href="https://chaoscypher.com/blog/mcp-server-launch#scenario-2-coding----your-projects-knowledge-base-in-your-editor" class="hash-link" aria-label="Direct link to Scenario 2: Coding -- Your Project's Knowledge Base in Your Editor" title="Direct link to Scenario 2: Coding -- Your Project's Knowledge Base in Your Editor" translate="no">​</a></h3>
<p>You're in Cursor, working on a codebase that has an associated knowledge graph mapping its architecture -- services, APIs, data flows, dependencies. You need to understand how the authentication service connects to the billing pipeline.</p>
<p>You ask: <em>"Traverse from the Authentication Service node to anything related to billing. What's the path?"</em></p>
<p>Cursor calls <code>resolve_node</code> to find the canonical node for "Authentication Service" (even if you didn't remember the exact label), then <code>traverse_path</code> to walk the graph three hops out, filtered to the relevant edge types. You see the chain: Authentication Service -&gt; User Session -&gt; Subscription Manager -&gt; Billing Pipeline. Without leaving your editor.</p>
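<p>A depth-limited traversal like this is essentially a breadth-first walk with a hop budget. The sketch below runs one over the chain from this scenario; the adjacency dictionary and the <code>traverse</code> function are illustrative stand-ins, not the <code>traverse_path</code> implementation, and edge-type filtering is left out.</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv">from collections import deque

def traverse(graph, start, max_hops):
    """Breadth-first walk; maps each node reachable within max_hops to its path."""
    paths = {start: [start]}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if len(paths[node]) - 1 == max_hops:
            continue  # hop budget spent on this branch
        for neighbor in graph.get(node, []):
            if neighbor not in paths:
                paths[neighbor] = paths[node] + [neighbor]
                queue.append(neighbor)
    return paths

graph = {
    "Authentication Service": ["User Session"],
    "User Session": ["Subscription Manager"],
    "Subscription Manager": ["Billing Pipeline"],
}
paths = traverse(graph, "Authentication Service", max_hops=3)</code></pre></div></div>
<p>With a budget of three hops, <code>paths["Billing Pipeline"]</code> holds the full chain. With a smaller budget the walk stops before reaching it.</p>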
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="scenario-3-writing----summarize-with-citations">Scenario 3: Writing -- Summarize With Citations<a href="https://chaoscypher.com/blog/mcp-server-launch#scenario-3-writing----summarize-with-citations" class="hash-link" aria-label="Direct link to Scenario 3: Writing -- Summarize With Citations" title="Direct link to Scenario 3: Writing -- Summarize With Citations" translate="no">​</a></h3>
<p>You're drafting a report and need to summarize everything in your knowledge graph about a specific topic, with citations back to the original source documents.</p>
<p>You ask: <em>"Summarize all my sources related to climate policy in the European Union. Include which documents each claim comes from."</em></p>
<p>Claude calls <code>get_summary_context</code> to retrieve and cluster document chunks relevant to the query. Because this tool returns the raw chunks with their source metadata rather than making an LLM call, Claude itself does the summarization -- giving you a synthesis grounded in your documents, with each claim traced back to a specific source.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="under-the-hood-30-tools-7-categories">Under the Hood: 30 Tools, 7 Categories<a href="https://chaoscypher.com/blog/mcp-server-launch#under-the-hood-30-tools-7-categories" class="hash-link" aria-label="Direct link to Under the Hood: 30 Tools, 7 Categories" title="Direct link to Under the Hood: 30 Tools, 7 Categories" translate="no">​</a></h2>
<p>Chaos Cypher exposes 30 tools through MCP, organized into seven categories. The design principle is that read operations are always safe and always available. Write operations are opt-in.</p>
<table><thead><tr><th>Category</th><th>Read</th><th>Write</th><th>What It Does</th></tr></thead><tbody><tr><td><strong>GraphRAG</strong></td><td><code>graphrag_search</code></td><td>--</td><td>The flagship tool. Fuses Personalized PageRank over the knowledge graph with hybrid vector/keyword search. Finds answers that pure vector search misses because it follows relationships.</td></tr><tr><td><strong>Nodes</strong></td><td><code>search_nodes</code>, <code>search_chunks</code>, <code>get_node</code>, <code>get_node_context</code>, <code>resolve_node</code></td><td><code>create_node</code>, <code>update_node</code>, <code>delete_node</code></td><td>Full CRUD for graph nodes. Search by name, properties, or semantic similarity. Resolve aliases to canonical nodes. Get a node's full neighborhood with edges and supporting document chunks.</td></tr><tr><td><strong>Edges</strong></td><td><code>list_edges</code>, <code>get_node_edges</code></td><td><code>create_edge</code></td><td>Explore and create relationships. Filter by direction (incoming/outgoing), edge type, or connected node.</td></tr><tr><td><strong>Templates</strong></td><td><code>list_templates</code>, <code>search_templates</code></td><td><code>create_template</code>, <code>delete_template</code></td><td>Templates define the schema for nodes and edges. Search by name or description. Create new types on the fly.</td></tr><tr><td><strong>Analytics</strong></td><td><code>analyze_graph_structure</code>, <code>find_shortest_path</code>, <code>find_similar_nodes</code>, <code>traverse_path</code></td><td>--</td><td>Structural analysis: community detection, PageRank centrality, degree distribution. Path finding between any two nodes. Semantic similarity via embeddings. 
Multi-hop traversal with depth and type filters.</td></tr><tr><td><strong>Documents</strong></td><td><code>get_summary_context</code>, <code>get_document_status</code></td><td><code>add_document</code>, <code>wait_for_document</code>, <code>remove_document</code></td><td>MCP-native document management. Queue files for background indexing and entity extraction. Check processing status. Wait for completion. Retrieve clustered chunks for summarization. Full cascade delete.</td></tr><tr><td><strong>Extraction</strong></td><td><code>get_extraction_tasks</code>, <code>get_extraction_chunks</code>, <code>get_extraction_progress</code></td><td><code>submit_chunk_extraction</code>, <code>finalize_extraction</code></td><td>Client-driven entity extraction. The AI assistant reads chunks, extracts entities itself, and submits results back — no server LLM required. Track progress and finalize to commit to the knowledge graph.</td></tr></tbody></table>
<p><strong>Read/write mode split:</strong> 19 tools are read-only and always available. 11 tools require write mode to be explicitly enabled. This is controlled by a single setting -- if you're not comfortable with an AI modifying your graph, just leave it in read mode. The AI can still search, traverse, and analyze everything.</p>
<p><strong>Two transport modes:</strong> The MCP server runs in two ways depending on your setup:</p>
<ul>
<li class=""><strong>stdio</strong> -- For desktop AI tools like Claude Desktop and Cursor. The CLI starts a server that communicates over standard input/output. No network involved.</li>
<li class=""><strong>Streamable HTTP</strong> -- For the Docker stack. The Cortex API exposes MCP at <code>/api/v1/mcp</code> using the Streamable HTTP transport, so any MCP client on the network can connect.</li>
</ul>
<p>Both transports expose the same 30 tools with the same behavior. The only difference is how they're connected.</p>
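<p>The table above describes <code>graphrag_search</code> as built on Personalized PageRank. For readers who have not met the algorithm, here is a small power-iteration sketch of plain Personalized PageRank. It is a textbook version under simple assumptions (unweighted directed edges, dangling nodes returning their mass to the seeds) and says nothing about how Chaos Cypher fuses the scores with vector and keyword search.</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv">def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """graph maps node to out-neighbors; seeds are the nodes the query matched."""
    nodes = set(graph) | {n for outs in graph.values() for n in outs}
    # Restart distribution: all restart probability sits on the seed nodes.
    restart = {n: (1 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1 - damping) * restart[n] for n in nodes}
        for node in nodes:
            outs = graph.get(node, [])
            if outs:
                share = damping * rank[node] / len(outs)
                for n in outs:
                    nxt[n] += share
            else:
                # Dangling node: hand its mass back to the seeds.
                for n in nodes:
                    nxt[n] += damping * rank[node] * restart[n]
        rank = nxt
    return rank

graph = {"CRISPR": ["Cas9"], "Cas9": ["Gene Therapy"], "Unrelated": ["Other"]}
scores = personalized_pagerank(graph, seeds={"CRISPR"})</code></pre></div></div>
<p>Nodes reachable from the seed get a score that decays with distance, while the disconnected pair scores zero. That is the sense in which a graph-aware search follows relationships instead of relying on text similarity alone.</p>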
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="try-it-yourself">Try It Yourself<a href="https://chaoscypher.com/blog/mcp-server-launch#try-it-yourself" class="hash-link" aria-label="Direct link to Try It Yourself" title="Direct link to Try It Yourself" translate="no">​</a></h2>
<p>Setup depends on how you run Chaos Cypher. Three paths, all quick.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="path-1-cli--claude-desktop">Path 1: CLI + Claude Desktop<a href="https://chaoscypher.com/blog/mcp-server-launch#path-1-cli--claude-desktop" class="hash-link" aria-label="Direct link to Path 1: CLI + Claude Desktop" title="Direct link to Path 1: CLI + Claude Desktop" translate="no">​</a></h3>
<p>If you have Chaos Cypher installed as a CLI tool, add this to your Claude Desktop configuration file (<code>claude_desktop_config.json</code>):</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#c8c8e0"><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"mcpServers"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token property" style="color:#c8c8e0">"chaoscypher"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"command"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"chaoscypher"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"args"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"mcp"</span><span class="token punctuation" 
style="color:#808098">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain"></span><span class="token punctuation" style="color:#808098">}</span><br></div></code></pre></div></div>
<p>Restart Claude Desktop. You should see Chaos Cypher listed as an available MCP server in the tools panel.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="path-2-cli--claude-code">Path 2: CLI + Claude Code<a href="https://chaoscypher.com/blog/mcp-server-launch#path-2-cli--claude-code" class="hash-link" aria-label="Direct link to Path 2: CLI + Claude Code" title="Direct link to Path 2: CLI + Claude Code" translate="no">​</a></h3>
<p>One command:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#c8c8e0"><span class="token plain">claude mcp </span><span class="token function" style="color:#00fff0">add</span><span class="token plain"> chaoscypher -- chaoscypher mcp</span><br></div></code></pre></div></div>
<p>That's it. Claude Code will discover Chaos Cypher's tools automatically on the next session.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="path-3-cli--cursor">Path 3: CLI + Cursor<a href="https://chaoscypher.com/blog/mcp-server-launch#path-3-cli--cursor" class="hash-link" aria-label="Direct link to Path 3: CLI + Cursor" title="Direct link to Path 3: CLI + Cursor" translate="no">​</a></h3>
<p>Add this to your Cursor MCP configuration (<code>.cursor/mcp.json</code> in your project, or the global settings):</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#c8c8e0"><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"mcpServers"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token property" style="color:#c8c8e0">"chaoscypher"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"command"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"chaoscypher"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"args"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"mcp"</span><span class="token punctuation" 
style="color:#808098">]</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"transportType"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"stdio"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain"></span><span class="token punctuation" style="color:#808098">}</span><br></div></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="path-4-docker-stack-already-running">Path 4: Docker Stack (Already Running)<a href="https://chaoscypher.com/blog/mcp-server-launch#path-4-docker-stack-already-running" class="hash-link" aria-label="Direct link to Path 4: Docker Stack (Already Running)" title="Direct link to Path 4: Docker Stack (Already Running)" translate="no">​</a></h3>
<p>If you run Chaos Cypher via <code>docker-compose</code>, the MCP endpoint is already live. Your Cortex API serves MCP at:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#c8c8e0"><span class="token plain">http://localhost:8080/api/v1/mcp</span><br></div></code></pre></div></div>
<p>Any MCP client that supports the Streamable HTTP transport can connect directly. No additional configuration on the Chaos Cypher side.</p>
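<p>For a sense of what "connect directly" means on the wire, here is a minimal sketch of the first message an MCP client sends: a JSON-RPC 2.0 <code>initialize</code> call posted to the endpoint above. It uses only the Python standard library and builds the request without sending it. The protocol revision string, the client name, and the absence of any authentication header are assumptions for illustration, not details confirmed for Chaos Cypher.</p>

```python
import json
import urllib.request

# Endpoint quoted in the post; adjust host and port to match your stack.
MCP_URL = "http://localhost:8080/api/v1/mcp"


def build_initialize_request(url: str = MCP_URL) -> urllib.request.Request:
    """Build (but do not send) an MCP `initialize` call as a JSON-RPC 2.0 POST."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumption: use whichever protocol revision your client speaks.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with plain JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_initialize_request()
    print(req.get_method(), req.full_url)
    # To send it against a running stack:
    # with urllib.request.urlopen(req, timeout=10) as resp:
    #     print(resp.status, resp.read()[:200])
```

<p>In practice an MCP client library handles this handshake for you; the sketch is only meant to show that the Docker path needs nothing beyond an HTTP POST to the URL above.</p>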
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="configuring-access-mode">Configuring Access Mode<a href="https://chaoscypher.com/blog/mcp-server-launch#configuring-access-mode" class="hash-link" aria-label="Direct link to Configuring Access Mode" title="Direct link to Configuring Access Mode" translate="no">​</a></h3>
<p>By default, MCP runs in read-only mode. To enable write tools (creating nodes, adding documents, etc.), update your <code>settings.yaml</code>:</p>
<div class="language-yaml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-yaml codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#c8c8e0"><span class="token key atrule">mcp</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token key atrule">mode</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"> write         </span><span class="token comment" style="color:#505068;font-style:italic"># "read" (default) or "write" for full access</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token key atrule">auto_extract</span><span class="token punctuation" style="color:#808098">:</span><span class="token plain"> </span><span class="token boolean important" style="color:#7b2ff7">true</span><span class="token plain">  </span><span class="token comment" style="color:#505068;font-style:italic"># auto-extract entities from documents uploaded via MCP</span><br></div></code></pre></div></div>
<p>Read mode exposes the 19 read tools. Write mode exposes all 30: the same 19 plus the 11 write tools. The <code>auto_extract</code> flag controls whether documents uploaded via the <code>add_document</code> tool automatically go through entity extraction after indexing, or just get chunked and embedded for RAG search.</p>
<p>If you're using the CLI with a specific database, pass it as a flag:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#c8c8e0"><span class="token plain">chaoscypher mcp </span><span class="token parameter variable" style="color:#c8c8e0">--database</span><span class="token plain"> my-research</span><br></div></code></pre></div></div>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="your-data-stays-local">Your Data Stays Local<a href="https://chaoscypher.com/blog/mcp-server-launch#your-data-stays-local" class="hash-link" aria-label="Direct link to Your Data Stays Local" title="Direct link to Your Data Stays Local" translate="no">​</a></h2>
<p>This is worth stating explicitly: MCP doesn't send your knowledge graph to any external service. The protocol is a local communication channel between the AI tool running on your machine and the Chaos Cypher server running on your machine (or your network, if you use Docker). When Claude calls <code>graphrag_search</code>, the query goes from Claude to your local MCP server, your server searches your local database, and only the results for that query go back to Claude. Your database and document store never leave your infrastructure.</p>
<p>The AI model itself runs wherever it runs -- that's between you and your provider. But the knowledge graph data stays entirely under your control. If you pair Chaos Cypher with a local model via Ollama, the entire pipeline is air-gapped. See our <a class="" href="https://chaoscypher.com/blog/local-ai-knowledge-graph">local AI setup guide</a> for the full walkthrough.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="whats-next">What's Next<a href="https://chaoscypher.com/blog/mcp-server-launch#whats-next" class="hash-link" aria-label="Direct link to What's Next" title="Direct link to What's Next" translate="no">​</a></h2>
<p>MCP support is the foundation for a broader vision: your knowledge graph as a persistent layer that any tool in your workflow can tap into. Here's what's on the roadmap:</p>
<ul>
<li class=""><strong>Prompt templates</strong> -- Pre-built MCP prompts for common patterns like "summarize this topic with citations" or "find contradictions in my sources," so you don't have to craft the right question every time.</li>
<li class=""><strong>Resource exposure</strong> -- Making graph nodes and documents available as MCP resources, so AI tools can browse your knowledge graph like a file system.</li>
<li class=""><strong>Multi-database switching</strong> -- Seamlessly switch between knowledge graphs within a single MCP session.</li>
</ul>
<p>The flagship <code>graphrag_search</code> tool deserves its own explanation -- it's doing a lot more than keyword lookup. Read <a class="" href="https://chaoscypher.com/blog/graphrag-enhanced-search">how GraphRAG works</a> for the full deep-dive on the retrieval pipeline.</p>
<p>The MCP server ships with Chaos Cypher today. If you're already running it, you have it -- just configure your AI tool and go.</p>
<ul>
<li class=""><strong>Documentation:</strong> Full MCP setup guide and tool reference in the <a class="" href="https://chaoscypher.com/docs/user-guide/mcp">docs</a></li>
<li class=""><strong>Source:</strong> The MCP implementation lives in the <code>chaoscypher_core.mcp</code> package.</li>
<li class=""><strong>Issues:</strong> Found a bug or have a feature request? <a href="https://github.com/chaoscypherinc/chaoscypher-docs/issues" target="_blank" rel="noopener noreferrer" class="">Open an issue</a> or <a href="https://github.com/chaoscypherinc/chaoscypher-docs/discussions" target="_blank" rel="noopener noreferrer" class="">start a discussion</a></li>
</ul>
<p>The gap between "having a knowledge graph" and "using a knowledge graph" has always been the friction of switching contexts. MCP closes that gap. Your knowledge graph is no longer a destination you visit -- it's a capability that follows you into whatever tool you're already working in.</p>]]></content:encoded>
            <category>feature-launch</category>
        </item>
        <item>
            <title><![CDATA[Automate Your Knowledge Pipeline: Triggers, Workflows, and AI Tools]]></title>
            <link>https://chaoscypher.com/blog/workflow-automation-guide</link>
            <guid>https://chaoscypher.com/blog/workflow-automation-guide</guid>
            <pubDate>Thu, 12 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Build event-driven knowledge pipelines in Chaos Cypher using triggers, multi-step workflows, and composable AI tools — no manual babysitting required.]]></description>
            <content:encoded><![CDATA[<p>Most knowledge management tools treat you like a filing clerk. Upload a document, wait for extraction, manually review entities, fix errors, tag things, connect things. Then do it all again for the next document. And the next. And the next fifty.</p>
<p>This is fine when you have ten documents. It falls apart at a hundred. It becomes genuinely painful at a thousand. The bottleneck is never the AI -- it's the human loop. Every document requires your attention, your judgment calls, your clicks. The extraction might take thirty seconds. Your review and cleanup take ten minutes.</p>
<p>Chaos Cypher's workflow engine exists to close that gap. You define a processing pipeline once -- what to extract, how to validate it, where to send notifications -- and every new document flows through it automatically. No babysitting. No repetitive clicking. You set the rules, the system follows them.</p>
<p>This isn't a cron job bolted onto the side. It's a proper workflow engine with event-driven triggers, conditional branching, step-to-step data passing, and composable AI tools. Let me walk you through how it works.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="a-concrete-workflow-auto-processing-research-papers">A Concrete Workflow: Auto-Processing Research Papers<a href="https://chaoscypher.com/blog/workflow-automation-guide#a-concrete-workflow-auto-processing-research-papers" class="hash-link" aria-label="Direct link to A Concrete Workflow: Auto-Processing Research Papers" title="Direct link to A Concrete Workflow: Auto-Processing Research Papers" translate="no">​</a></h2>
<p>Abstractions are boring. Let's look at a real workflow you might build: automatically processing research papers as they're uploaded.</p>
<p>Here's the pipeline:</p>
<p><strong>Trigger:</strong> A new source file is uploaded (<code>file.upload</code> event fires).</p>
<p><strong>Step 1 -- AI Prompt:</strong> Summarize the document in three sentences. The <code>ai.prompt</code> tool sends the document text to your configured LLM with instructions to produce a concise summary. If the document is long, it automatically chunks the text and processes sections in parallel, then merges the results.</p>
<p><strong>Step 2 -- AI Extract JSON:</strong> Pull out structured metadata. Authors, publication date, journal name, key findings, methodology type. The <code>ai.extract_json</code> tool takes the document text and a JSON schema defining exactly what you want, then returns validated structured data. It retries if the extraction doesn't match the schema.</p>
<p><strong>Step 3 -- Conditional:</strong> Check if this is a clinical study. The <code>logic.conditional</code> tool evaluates whether <code>{{steps.step_2.extracted_data.methodology_type}}</code> equals <code>"clinical_trial"</code>. If true, the workflow branches to run additional <a class="" href="https://chaoscypher.com/blog/domain-extraction-guide">medical-domain extraction</a>. If false, it skips ahead.</p>
<p><strong>Step 4 -- HTTP Request:</strong> Post a notification to a Slack webhook with the summary from Step 1 and the metadata from Step 2. The <code>http.request</code> tool sends a POST to your webhook URL with a JSON body containing <code>{{steps.step_1.result}}</code> and <code>{{steps.step_2.extracted_data}}</code>.</p>
<p>Notice the <code>{{steps.step_1.result}}</code> syntax. That's the interpolation engine at work. Every step's output is available to every subsequent step via dot-notation paths. You can reference <code>{{inputs.document_text}}</code> for the original trigger data, <code>{{steps.step_2.extracted_data.authors}}</code> for a nested field from a previous step, or even <code>{{steps.step_3.branch_taken}}</code> to see which conditional path was followed. The interpolation preserves types too -- if a previous step returned a number, you get a number, not the string <code>"42"</code>.</p>
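<p>To make the interpolation rules concrete, here is a minimal Python sketch of how <code>{{...}}</code> placeholders with dot-notation paths could be resolved while preserving types. It is an illustration of the behavior described above, not Chaos Cypher's actual implementation; the function names and the <code>page_count</code> field are hypothetical.</p>

```python
import re
from typing import Any

# Matches {{ some.dotted.path }} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def resolve_path(context: Any, path: str) -> Any:
    """Walk a dot-notation path; numeric segments index into lists."""
    current = context
    for part in path.split("."):
        if isinstance(current, list):
            current = current[int(part)]
        else:
            current = current[part]
    return current


def interpolate(template: str, context: dict) -> Any:
    """Resolve {{path}} placeholders in a template against the context."""
    whole = PLACEHOLDER.fullmatch(template.strip())
    if whole:
        # The template is exactly one placeholder: return the raw value so
        # numbers, lists, and dicts keep their original type.
        return resolve_path(context, whole.group(1))
    # Placeholders embedded in longer text are stringified in place.
    return PLACEHOLDER.sub(
        lambda m: str(resolve_path(context, m.group(1))), template
    )


# Hypothetical run context shaped like the pipeline above.
context = {
    "inputs": {"document_text": "Full text of the paper"},
    "steps": {
        "step_1": {"result": "Short summary."},
        "step_2": {
            "extracted_data": {
                "authors": ["A. Author", "B. Author"],
                "page_count": 42,  # hypothetical field, for illustration
            }
        },
    },
}

if __name__ == "__main__":
    print(interpolate("Summary: {{steps.step_1.result}}", context))
    print(type(interpolate("{{steps.step_2.extracted_data.page_count}}", context)))
```

<p>The design choice worth noting is the split between the two cases: a value that stands alone keeps its type, while a value embedded in surrounding text has to become a string.</p>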
<p>This entire pipeline runs without human intervention. Upload a PDF, walk away, come back to a summarized, metadata-tagged, conditionally-processed document with a Slack notification waiting for you.</p>
<p><img decoding="async" loading="lazy" alt="Workflow list showing automation with status and controls" src="https://chaoscypher.com/assets/images/workflows-list-69b23eb07ee5739a801b649934b03772.png" width="1280" height="800" class="img_ev3q"></p>
<p><img decoding="async" loading="lazy" alt="Queue monitor showing task status and history" src="https://chaoscypher.com/assets/images/queue-monitor-06064e811e38be2c40b29ba6380f4163.png" width="1280" height="800" class="img_ev3q"></p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="under-the-hood-10-built-in-tools">Under the Hood: 10 Built-In Tools<a href="https://chaoscypher.com/blog/workflow-automation-guide#under-the-hood-10-built-in-tools" class="hash-link" aria-label="Direct link to Under the Hood: 10 Built-In Tools" title="Direct link to Under the Hood: 10 Built-In Tools" translate="no">​</a></h2>
<p>The workflow engine ships with ten built-in tools organized into five categories. Each tool has a defined input schema and output schema, so the system validates your configuration before anything runs.</p>
<table><thead><tr><th>Category</th><th>Tools</th><th>What They Do</th></tr></thead><tbody><tr><td><strong>AI</strong></td><td><code>ai.prompt</code>, <code>ai.extract_json</code>, <code>ai.vector_search</code>, <code>ai.generate_embedding</code></td><td>LLM interactions with chunking support, structured JSON extraction with schema validation and retries, semantic search across your knowledge graph, vector embedding generation for entities</td></tr><tr><td><strong>Data</strong></td><td><code>data.extract</code>, <code>data.merge</code></td><td>Pull values from nested objects using dot-notation paths (<code>user.addresses.0.city</code>), merge multiple dictionaries with shallow or deep strategies</td></tr><tr><td><strong>Logic</strong></td><td><code>logic.conditional</code>, <code>logic.loop</code></td><td>If/then branching with safe expression evaluation, iterate over collections with configurable limits</td></tr><tr><td><strong>HTTP</strong></td><td><code>http.request</code></td><td>External API calls with all HTTP methods, bearer/basic auth, configurable timeouts, and SSRF protection that blocks localhost access</td></tr><tr><td><strong>Templates</strong></td><td><code>templates.list</code></td><td>Query your knowledge graph schema to discover available node templates</td></tr></tbody></table>
<p>A few things worth highlighting about specific tools:</p>
<p><strong><code>ai.prompt</code></strong> is smarter than a simple LLM call. It supports chunk strategies (<code>quick</code> and <code>full</code>) for documents that exceed the model's context window. When chunking is enabled, it splits the document on paragraph boundaries, processes each chunk in parallel via the LLM queue, and intelligently merges the results -- concatenating text outputs, extending arrays, and merging objects.</p>
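<p>The merge rules in that last sentence can be sketched in a few lines of Python. This is an assumed reading of "concatenating text outputs, extending arrays, and merging objects", not the engine's real code; the paragraph separator and the handling of mismatched types are guesses.</p>

```python
from functools import reduce
from typing import Any


def merge_results(left: Any, right: Any) -> Any:
    """Merge two chunk outputs that share the same shape."""
    if isinstance(left, str) and isinstance(right, str):
        # Text outputs are concatenated (separator is an assumption).
        return left + "\n\n" + right
    if isinstance(left, list) and isinstance(right, list):
        # Arrays are extended in chunk order.
        return left + right
    if isinstance(left, dict) and isinstance(right, dict):
        # Objects are merged key by key, recursing on shared keys.
        merged = dict(left)
        for key, value in right.items():
            merged[key] = merge_results(merged[key], value) if key in merged else value
        return merged
    # Scalars or mismatched types: assumption, the later chunk wins.
    return right


def merge_chunks(results: list) -> Any:
    """Fold per-chunk results into one value; None when there are no chunks."""
    if not results:
        return None
    return reduce(merge_results, results)
```

<p>Because the merge is a fold over chunk order, the parallel LLM calls can finish in any order as long as results are collected back into their original sequence before merging.</p>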
<p><strong><code>ai.extract_json</code></strong> enforces structure. You provide a JSON schema defining what you expect (say, <code>{"entities": [{"name": "string", "type": "string"}]}</code>), and the tool validates the LLM's output against it. If the output doesn't match, it retries automatically. This makes extraction reliable enough to run unattended.</p>
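<p>The validate-then-retry loop can be sketched as follows. The <code>call_llm</code> callable, the attempt limit, and the shape checker for the shorthand schema shown above are all hypothetical stand-ins; a real implementation would more likely use a full JSON Schema validator.</p>

```python
import json
from typing import Any, Callable

# Type names used by the shorthand schema in the post's example.
TYPE_NAMES = {"string": str, "number": (int, float)}


def matches_shape(value: Any, shape: Any) -> bool:
    """Check a parsed value against a shorthand shape description."""
    if isinstance(shape, str):
        return isinstance(value, TYPE_NAMES[shape])
    if isinstance(shape, list):
        # A one-element list describes the shape of every item.
        return isinstance(value, list) and all(
            matches_shape(item, shape[0]) for item in value
        )
    if isinstance(shape, dict):
        return isinstance(value, dict) and all(
            key in value and matches_shape(value[key], sub)
            for key, sub in shape.items()
        )
    return False


def extract_json(
    call_llm: Callable[[str], str],
    prompt: str,
    shape: Any,
    max_attempts: int = 3,  # assumption: the real retry limit is not stated
) -> Any:
    """Call the model until its output parses and matches the shape."""
    for _ in range(max_attempts):
        raw = call_llm(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not JSON at all: try again
        if matches_shape(data, shape):
            return data
    raise ValueError(f"no valid output after {max_attempts} attempts")
```

<p>Raising after the final attempt, rather than returning partial data, is what lets later steps trust that anything they receive already matches the schema.</p>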
<p><strong><code>ai.vector_search</code></strong> lets workflows query the knowledge graph semantically. Give it a natural language query and it performs hybrid search -- combining vector similarity with keyword fallback -- to find matching nodes. You can filter by template type and set a similarity threshold. This is how you build workflows that reason about existing knowledge: "find all entities similar to what we just extracted and check for duplicates."</p>
<p><strong><code>http.request</code></strong> has built-in security. URLs are validated before any request is sent -- only <code>http</code> and <code>https</code> schemes are allowed, and direct <code>localhost</code> access is blocked to prevent SSRF attacks. It supports all standard HTTP methods (GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS), bearer and basic authentication, custom headers, and JSON or string request bodies. When Chaos Cypher runs in Docker, other containers remain reachable via their service names.</p>
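<p>A pre-flight check along these lines might look like the sketch below. It covers only the two rules named above (scheme allow-list, no localhost) and is an assumption about the behavior, not the shipped validator; it does not resolve DNS or block private address ranges, which a hardened SSRF filter would also consider.</p>

```python
import ipaddress
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def is_url_allowed(url: str) -> bool:
    """Return True when the URL passes the scheme and localhost checks."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        return False
    host = parsed.hostname
    if not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        # Literal IPs: reject the loopback range (127.0.0.0/8 and ::1).
        return not ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, e.g. a public hostname or a Docker service name.
        return True
```

<p>Under these rules a webhook on a public host or a sibling container such as the hypothetical <code>http://my-service:9000/hook</code> passes, while <code>http://localhost:8080</code> and <code>http://127.0.0.1</code> are rejected.</p>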
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="how-triggers-work">How Triggers Work<a href="https://chaoscypher.com/blog/workflow-automation-guide#how-triggers-work" class="hash-link" aria-label="Direct link to How Triggers Work" title="Direct link to How Triggers Work" translate="no">​</a></h3>
<p>Triggers are the entry point for automated workflows. They listen for events in the system and fire workflows when conditions are met.</p>
<p><strong>Event sources</strong> define what happened: <code>node.create</code> (a new node was added to the graph), <code>node.update</code> (an existing node was modified), <code>file.upload</code> (a new source file was uploaded), <code>import.completed</code> (a batch import finished). The system ships with built-in triggers for auto-embedding -- every time a node is created or updated, a workflow automatically generates vector embeddings for it.</p>
<p><strong>Filters</strong> let you narrow the scope. A trigger on <code>node.create</code> with a filter <code>{"template_id": "person_template"}</code> only fires when a Person node is created, not when any node is created. Filters use exact key-value matching against the event data.</p>
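<p>Exact key-value matching is simple enough to sketch in full. The trigger dictionaries below (<code>event</code>, <code>filters</code>, <code>workflow</code>) and the workflow names are a hypothetical layout chosen for illustration; only the matching rule itself comes from the description above.</p>

```python
def event_matches(filters: dict, event_data: dict) -> bool:
    """True when every filter key is present in the event with an equal value."""
    return all(
        key in event_data and event_data[key] == expected
        for key, expected in filters.items()
    )


def dispatch(event_type: str, event_data: dict, triggers: list) -> list:
    """Return the workflows whose trigger matches this event."""
    return [
        trigger["workflow"]
        for trigger in triggers
        if trigger["event"] == event_type
        and event_matches(trigger.get("filters", {}), event_data)
    ]


# Hypothetical trigger table: one filtered trigger, one that fires on every node.
TRIGGERS = [
    {
        "event": "node.create",
        "filters": {"template_id": "person_template"},
        "workflow": "enrich_person",
    },
    {"event": "node.create", "workflow": "auto_embed"},
]
```

<p>An empty filter matches every event of that type, which is how a catch-all trigger like auto-embedding can coexist with narrowly scoped ones.</p>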
<p><strong>Statistics tracking</strong> gives you visibility. Every trigger execution records success/failure status, execution time, and error messages. You can see your success rate, average execution time, and recent execution history -- useful for debugging workflows that occasionally fail.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="the-expose-as-ai-tool-feature">The "Expose as AI Tool" Feature<a href="https://chaoscypher.com/blog/workflow-automation-guide#the-expose-as-ai-tool-feature" class="hash-link" aria-label="Direct link to The &quot;Expose as AI Tool&quot; Feature" title="Direct link to The &quot;Expose as AI Tool&quot; Feature" translate="no">​</a></h3>
<p>Here's where things get composable. Any workflow can be exposed as a callable AI tool by setting <code>expose_as_ai_tool: true</code> and defining input/output schemas. Once exposed, that workflow appears alongside the built-in tools and can be used as a step in other workflows.</p>
<p>Think about what this enables. You build a workflow that extracts and validates medical terminology. You expose it as a tool. Now your "process research papers" workflow can call it as Step 3 instead of hardcoding medical-domain logic. You have a workflow that enriches person entities by cross-referencing external APIs? Expose it, and any other workflow can use it.</p>
<p>Workflows calling workflows. Each one focused on a single job, composed together into pipelines of arbitrary complexity. The step type <code>workflow</code> (alongside <code>system_tool</code> and <code>user_tool</code>) tells the engine to execute another workflow as a step, passing inputs and receiving outputs just like any other tool.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="workflow-portability">Workflow Portability<a href="https://chaoscypher.com/blog/workflow-automation-guide#workflow-portability" class="hash-link" aria-label="Direct link to Workflow Portability" title="Direct link to Workflow Portability" translate="no">​</a></h3>
<p>Workflows are portable. You can export any workflow to a version-stamped JSON file that includes the workflow definition, all its steps, and their configurations. Import it into another Chaos Cypher instance -- or share it with someone else running their own instance.</p>
<p>The import process is deliberate about safety. Before importing, the system validates the export version for compatibility and checks that all referenced tools exist in the target instance. It walks through every step, resolves each <code>tool_id</code> against the registry of system tools and user tools, and fails early if anything is missing. If a workflow references <code>ai.prompt</code> and <code>http.request</code>, those tools must be available. If a custom tool plugin is missing, the import fails with a clear error message rather than creating a broken workflow.</p>
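<p>The pre-import checks can be sketched as a single validation pass. The set of supported versions, the exception name, and the flat set of tool IDs standing in for the registry are assumptions; the <code>version</code>, <code>steps</code>, and <code>tool_id</code> keys follow the export format described in this post.</p>

```python
# Assumption: only the version shown in the post's example export is accepted.
SUPPORTED_EXPORT_VERSIONS = {"1.0"}


class WorkflowImportError(ValueError):
    """Raised before anything is written when an export cannot be imported."""


def validate_export(export: dict, available_tools: set) -> None:
    """Check version compatibility and that every step's tool exists."""
    version = export.get("version")
    if version not in SUPPORTED_EXPORT_VERSIONS:
        raise WorkflowImportError(f"unsupported export version: {version!r}")

    missing = []
    for step in export.get("steps", []):
        tool_id = step.get("tool_id")
        if tool_id not in available_tools and tool_id not in missing:
            missing.append(tool_id)

    if missing:
        names = ", ".join(str(tool_id) for tool_id in missing)
        raise WorkflowImportError(f"missing tools in target instance: {names}")
```

<p>Collecting every missing tool before raising gives the person importing one complete error message instead of a fix-and-retry loop.</p>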
<p>This design keeps workflows self-describing. The JSON file contains everything needed to reconstruct the workflow -- no hidden state, no implicit dependencies on database IDs. Export from your laptop, import on a server, share with a colleague. The only requirement is that the target instance has the same tools installed.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="try-it-yourself">Try It Yourself<a href="https://chaoscypher.com/blog/workflow-automation-guide#try-it-yourself" class="hash-link" aria-label="Direct link to Try It Yourself" title="Direct link to Try It Yourself" translate="no">​</a></h2>
<p>The fastest way to see the workflow engine in action is to look at the export format. Here's a minimal workflow that summarizes documents on upload:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#c8c8e0;--prism-background-color:#0d0d1a"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#c8c8e0;background-color:#0d0d1a"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#c8c8e0"><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"version"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"1.0"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"workflow"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Summarize on Upload"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Auto-summarize new documents when 
uploaded"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token property" style="color:#c8c8e0">"input_schema"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"type"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"object"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"properties"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"document_text"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">          </span><span class="token property" style="color:#c8c8e0">"type"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"string"</span><span class="token punctuation" style="color:#808098">,</span><span class="token 
plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">          </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"The document content to summarize"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"required"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token string" style="color:#39ff14">"document_text"</span><span class="token punctuation" style="color:#808098">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token property" style="color:#c8c8e0">"output_schema"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span 
class="token plain">      </span><span class="token property" style="color:#c8c8e0">"type"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"object"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"properties"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"summary"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">          </span><span class="token property" style="color:#c8c8e0">"type"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"string"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">          </span><span class="token property" style="color:#c8c8e0">"description"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Three-point summary of the document"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token punctuation" 
style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token punctuation" style="color:#808098">}</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token property" style="color:#c8c8e0">"steps"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">[</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"step_number"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token number" style="color:#ff6d00">1</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"name"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Summarize Document"</span><span class="token punctuation" 
style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"tool_type"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"system_tool"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"tool_id"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"ai.prompt"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token property" style="color:#c8c8e0">"configuration"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#808098">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"prompt"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> </span><span class="token string" style="color:#39ff14">"Summarize this document in 3 key points:\n\n{{inputs.document_text}}"</span><span class="token punctuation" style="color:#808098">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">        </span><span class="token property" style="color:#c8c8e0">"output_format"</span><span class="token operator" style="color:#ff2d95">:</span><span class="token plain"> 
</span><span class="token string" style="color:#39ff14">"text"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">      </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">    </span><span class="token punctuation" style="color:#808098">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain">  </span><span class="token punctuation" style="color:#808098">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#c8c8e0"><span class="token plain"></span><span class="token punctuation" style="color:#808098">}</span><br></div></code></pre></div></div>
<p>This is everything the system needs. The <code>version</code> field ensures forward compatibility. The <code>input_schema</code> and <code>output_schema</code> define the contract. The <code>steps</code> array contains the pipeline.</p>
<p>Each step specifies its <code>tool_type</code> (<code>system_tool</code>, <code>user_tool</code>, or <code>workflow</code>), a <code>tool_id</code> that references a registered tool, and a <code>configuration</code> object whose shape matches the tool's input schema. The <code>{{inputs.document_text}}</code> template variable gets resolved at execution time with the actual trigger data.</p>
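<p>To make the template step concrete, here is a minimal Python sketch of how a placeholder like <code>{{inputs.document_text}}</code> could be resolved against trigger data. The <code>resolve_template</code> helper and its regular expression are illustrative assumptions, not Chaos Cypher's actual resolver.</p>

```python
import re
from typing import Any

# Assumed placeholder syntax: a dotted path inside double braces,
# e.g. {{inputs.document_text}}.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def resolve_template(template: str, context: dict[str, Any]) -> str:
    """Replace each {{dotted.path}} with the value found in context."""

    def lookup(match: re.Match) -> str:
        value: Any = context
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError if the path is missing
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


prompt = "Summarize this document in 3 key points:\n\n{{inputs.document_text}}"
resolved = resolve_template(prompt, {"inputs": {"document_text": "Quarterly report..."}})
```

<p>In this sketch a missing path raises a <code>KeyError</code> instead of silently leaving the placeholder in the prompt. That is a design choice of the example, not documented engine behavior.</p>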
<p>When importing, you have three options for handling name conflicts:</p>
<ul>
<li class=""><strong><code>fail</code></strong> -- refuse to import if a workflow with the same name exists (the default, prevents accidental overwrites)</li>
<li class=""><strong><code>skip</code></strong> -- silently keep the existing workflow and skip the import</li>
<li class=""><strong><code>rename</code></strong> -- import with <code> (imported)</code> appended to the name</li>
</ul>
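<p>The three conflict strategies above can be sketched as a single decision function. This is a hypothetical illustration: <code>resolve_import</code>, its parameters, and the exception type are assumptions. Only the strategy names and the <code> (imported)</code> suffix come from the list above.</p>

```python
def resolve_import(name, existing_names, on_conflict="fail"):
    """Return the name to import under, or None when the import is skipped."""
    if name not in existing_names:
        return name  # no conflict, import as-is
    if on_conflict == "fail":
        raise ValueError(f"workflow {name!r} already exists")
    if on_conflict == "skip":
        return None  # keep the existing workflow untouched
    if on_conflict == "rename":
        # Not handled here: the renamed workflow could itself collide.
        return f"{name} (imported)"
    raise ValueError(f"unknown conflict strategy: {on_conflict!r}")


existing = {"Summarize Document"}
renamed = resolve_import("Summarize Document", existing, on_conflict="rename")
```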
<p>You can also import as inactive (<code>import_as_inactive: true</code>) to test a workflow before enabling it in production. This creates the workflow with <code>is_active: false</code>, letting you review the steps and do a manual test run before flipping it on.</p>
<p><img decoding="async" loading="lazy" alt="Settings page with import and export graph options" src="https://chaoscypher.com/assets/images/settings-general-95c668b12ae5b9ad1cb595b85162e6e9.png" width="1280" height="800" class="img_ev3q"></p>
<p><img decoding="async" loading="lazy" alt="Queue monitor with task tracking and auto-refresh" src="https://chaoscypher.com/assets/images/queue-monitor-06064e811e38be2c40b29ba6380f4163.png" width="1280" height="800" class="img_ev3q"></p>
<p>To set up the trigger, create a trigger record with the event source (like <code>file.upload</code>), link it to your workflow, and optionally add filters. The trigger system runs as a background event loop -- events are queued and processed asynchronously, so trigger evaluation never blocks the main API.</p>
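<p>The queue-and-consume pattern described here can be sketched with <code>asyncio</code>. Everything in the example is an assumption made for illustration: the shape of the trigger records, the exact-match filter semantics, and the <code>None</code> shutdown sentinel. It shows why enqueueing keeps the producer (the API handler) from waiting on trigger evaluation.</p>

```python
import asyncio

# Hypothetical trigger records: event source, linked workflow, optional filters.
TRIGGERS = [
    {"event_source": "file.upload", "workflow": "Summarize Document",
     "filters": {"extension": "pdf"}},
]


def matching_workflows(event):
    """Return workflows whose trigger source and filters all match the event."""
    matches = []
    for trigger in TRIGGERS:
        if trigger["event_source"] != event["source"]:
            continue
        filters = trigger.get("filters", {})
        if all(event["data"].get(k) == v for k, v in filters.items()):
            matches.append(trigger["workflow"])
    return matches


async def trigger_loop(queue, started):
    """Background consumer: evaluates triggers off the request path."""
    while True:
        event = await queue.get()
        if event is None:  # sentinel: shut the loop down
            break
        started.extend(matching_workflows(event))


async def main():
    queue = asyncio.Queue()
    started = []
    worker = asyncio.create_task(trigger_loop(queue, started))
    # The API handler only enqueues; it never waits for evaluation.
    queue.put_nowait({"source": "file.upload", "data": {"extension": "pdf"}})
    queue.put_nowait({"source": "file.upload", "data": {"extension": "txt"}})
    queue.put_nowait(None)
    await worker
    return started


started = asyncio.run(main())
```

<p>Only the PDF upload passes the filter, so <code>started</code> ends up holding a single workflow name.</p>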
<p>For more complex workflows, the step dependency system lets you control execution order beyond simple sequential numbering. Each step can declare <code>depends_on</code> (a list of step IDs that must complete before it runs) and <code>continue_on_error</code> (let the workflow proceed even if that step fails). You can also set <code>retry_on_failure</code> to have the engine retry a step automatically, and <code>timeout_seconds</code> to cap how long any individual step can run. Combined with <code>logic.conditional</code> for branching and <code>logic.loop</code> for iteration, these options let you express pipelines well beyond a flat sequence of steps.</p>
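<p>One way to picture <code>depends_on</code> is as a dependency graph that gets topologically sorted before execution. The sketch below uses Python's standard-library <code>graphlib</code>. The step IDs and the <code>execution_order</code> helper are made up for the example, and the engine's real scheduler may work differently (it also has to account for retries, timeouts, and <code>continue_on_error</code>, which are not modeled here).</p>

```python
from graphlib import TopologicalSorter

# Hypothetical steps; only the depends_on field name comes from the article.
steps = [
    {"id": "extract", "depends_on": []},
    {"id": "summarize", "depends_on": ["extract"]},
    {"id": "tag", "depends_on": ["extract"]},
    {"id": "report", "depends_on": ["summarize", "tag"]},
]


def execution_order(steps):
    """Return step IDs in an order that respects every depends_on edge."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {step["id"]: set(step.get("depends_on", [])) for step in steps}
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(steps)
```

<p>Here <code>extract</code> always comes first and <code>report</code> last. <code>summarize</code> and <code>tag</code> have no dependency on each other, so their relative order is not guaranteed.</p>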
<p>The execution model tracks everything. Each workflow run produces an execution record with status (pending, running, completed, failed, cancelled), the inputs that were provided, the outputs that were produced, timing data for each step, and -- critically -- which step failed and why if something goes wrong. This execution history is what makes workflows debuggable. When a workflow fails at 3am, you don't have to guess what happened. You look at the execution detail, see that Step 3 timed out after 120 seconds waiting for the LLM, and adjust accordingly.</p>
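<p>As a rough mental model, an execution record could look like the dataclass below. The status values are the ones listed above. The field names (<code>step_seconds</code>, <code>failed_step</code>, <code>error</code>) are assumptions for illustration and may not match the real API response.</p>

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Optional


class Status(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


@dataclass
class Execution:
    """Hypothetical shape of one workflow run's execution record."""
    inputs: dict[str, Any]
    status: Status = Status.PENDING
    outputs: dict[str, Any] = field(default_factory=dict)
    step_seconds: dict[int, float] = field(default_factory=dict)
    failed_step: Optional[int] = None
    error: Optional[str] = None

    def describe_failure(self) -> str:
        """Summarize which step failed and why, if the run failed."""
        if self.status is not Status.FAILED:
            return "no failure recorded"
        return f"Step {self.failed_step} failed: {self.error}"


run = Execution(inputs={"document_text": "..."})
run.status = Status.FAILED
run.step_seconds = {1: 2.4, 2: 0.8, 3: 120.0}
run.failed_step = 3
run.error = "timed out after 120 seconds waiting for the LLM"
message = run.describe_failure()
```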
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="whats-next">What's Next<a href="https://chaoscypher.com/blog/workflow-automation-guide#whats-next" class="hash-link" aria-label="Direct link to What's Next" title="Direct link to What's Next" translate="no">​</a></h2>
<p>The workflow engine is designed to grow. The tool system uses a plugin architecture -- the same pattern that powers Chaos Cypher's loader plugins, domain plugins, and LLM providers. Custom tool plugins in Python are on the roadmap for users who need capabilities beyond the ten built-in tools. The planned model is simple: implement a class with <code>tool_id</code>, <code>input_schema</code>, <code>output_schema</code>, and an <code>execute</code> method, drop it in the plugins directory, and it auto-registers.</p>
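<p>Since custom tool plugins are not shipped yet, the following is only a guess at what such a class might look like. The four member names come from the description above. The <code>register</code> decorator, the <code>TOOL_REGISTRY</code> dictionary, the <code>custom.word_count</code> ID, and the <code>execute</code> signature are all hypothetical stand-ins for whatever discovery mechanism the final plugin API uses.</p>

```python
from typing import Any

# Stand-in for the engine's tool registry (hypothetical).
TOOL_REGISTRY: dict[str, Any] = {}


def register(cls):
    """Mimic auto-registration by indexing the tool class by its tool_id."""
    TOOL_REGISTRY[cls.tool_id] = cls
    return cls


@register
class WordCountTool:
    tool_id = "custom.word_count"  # hypothetical ID
    input_schema = {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    }
    output_schema = {
        "type": "object",
        "properties": {"word_count": {"type": "integer"}},
    }

    def execute(self, configuration: dict[str, Any]) -> dict[str, Any]:
        """Count whitespace-separated words in the configured text."""
        return {"word_count": len(configuration["text"].split())}


tool = TOOL_REGISTRY["custom.word_count"]()
result = tool.execute({"text": "graphs need typed edges"})
```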
<p>More trigger event sources are coming as the platform grows. Scheduling (run a workflow every Tuesday at 9am) and webhook triggers (fire a workflow from an external system) are natural extensions of the existing event-driven architecture.</p>
<p>If you've built an interesting automation workflow -- whether it's a multi-step research pipeline, a quality assurance checker, or an integration with external tools -- I'd genuinely like to hear about it. The export format makes sharing straightforward: export your workflow, share the JSON, and someone else can import it and adapt it to their use case. That's the whole point of portability.</p>
<p>For the full API reference and detailed configuration options, check out the <a class="" href="https://chaoscypher.com/docs/reference/api/workflows">workflow documentation</a>. The built-in system workflows (like auto-embedding on node create/update) are also good starting points -- export them and study the step configurations to see how the engine's own automation is wired together.</p>]]></content:encoded>
            <category>workflows</category>
            <category>tutorials</category>
        </item>
    </channel>
</rss>