Built by Postindustria. We help teams build agentic production systems.

LLM-Driven Pipelines

The @node decorator and ForwardConstruct are for humans writing source code. But pipelines don’t have to be defined at source-code time — they can be assembled at runtime by an LLM, a config file, or a routing layer. This is where the programmatic API (Node + Construct + | pipe) earns its place as a first-class surface, not a legacy leftover.

An LLM is the graph architect. You give it a high-level goal, expose a schema for what a pipeline looks like (nodes, modes, modifiers), and it emits a structured spec via tool calling or JSON mode. Your runtime parses the spec, constructs Nodes, chains modifiers, and calls compile() + run(). The validator catches malformed specs before anything executes.
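One way to expose that schema is a JSON Schema document handed to the model's tool-calling or JSON-mode interface. The sketch below is an assumption, not NeoGraph's published schema: the field names mirror the example spec on this page, and PIPELINE_SPEC_SCHEMA and missing_required are hypothetical names. The required-field walk is deliberately shallow; a real deployment would more likely pass the schema to a full JSON Schema validator.

```python
# Hypothetical JSON Schema for the spec an LLM is asked to emit.
# Field names follow the example spec on this page; the enum values
# ("think", "agent"; "Oracle", "Each", "Operator") are only the ones
# that appear here, not an exhaustive list taken from NeoGraph.
MODIFIER_SCHEMA = {
    "type": "object",
    "properties": {"type": {"enum": ["Oracle", "Each", "Operator"]}},
    "required": ["type"],
}

NODE_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "mode": {"enum": ["think", "agent"]},
        "output": {"type": "string"},
        "prompt": {"type": "string"},
        "model": {"type": "string"},
        "tools": {"type": "array", "items": {"type": "string"}},
        "modifiers": {"type": "array", "items": MODIFIER_SCHEMA},
    },
    "required": ["name", "mode", "output"],
}

PIPELINE_SPEC_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "nodes": {"type": "array", "items": NODE_SCHEMA, "minItems": 1},
    },
    "required": ["name", "nodes"],
}


def missing_required(spec: dict) -> list[str]:
    """Return paths of required fields absent from a spec (shallow check)."""
    problems = [k for k in PIPELINE_SPEC_SCHEMA["required"] if k not in spec]
    for i, node in enumerate(spec.get("nodes", [])):
        for k in NODE_SCHEMA["required"]:
            if k not in node:
                problems.append(f"nodes[{i}].{k}")
    return problems
```

Keeping the schema as plain data means the same dict can be sent to the model and reused by the runtime for a first-pass check.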

This is fundamentally different from hardcoded pipelines. The LLM can:

  • Pick which nodes to include based on the input
  • Decide whether to run an ensemble or a single pass
  • Add fan-out when the input is a collection
  • Insert human-in-the-loop checks for sensitive outputs
  • Skip steps that aren’t needed for this particular request

Suppose you’ve given an LLM a prompt like “Given this user request, design a NeoGraph pipeline to handle it. Return JSON matching this schema: …” The LLM returns:

llm_output = {
    "name": "requirements-analysis",
    "nodes": [
        {
            "name": "decompose",
            "mode": "think",
            "output": "Claims",
            "prompt": "rw/decompose",
            "model": "reason",
            "modifiers": [
                {"type": "Oracle", "n": 3, "merge_fn": "merge_claims"}
            ],
        },
        {
            "name": "verify",
            "mode": "agent",
            "output": "MatchResult",
            "prompt": "match/verify",
            "model": "fast",
            "tools": ["search_codebase"],
            "modifiers": [
                {"type": "Each", "over": "decompose.items", "key": "label"}
            ],
        },
        {
            "name": "report",
            "mode": "think",
            "output": "Report",
            "prompt": "rw/report",
            "model": "fast",
        },
    ],
}

Your runtime turns it into a pipeline:

from neograph import (
    Node, Tool, Construct, Oracle, Each, Operator, compile, run,
)

# Your type registry (schemas the LLM can reference by name)
TYPES = {"Claims": Claims, "MatchResult": MatchResult, "Report": Report}
TOOLS = {"search_codebase": Tool("search_codebase", budget=5)}

def build_node(spec):
    # Build the bare node from the spec's fields.
    n = Node(
        spec["name"],
        mode=spec["mode"],
        outputs=TYPES[spec["output"]],
        prompt=spec.get("prompt"),
        model=spec.get("model"),
        tools=[TOOLS[t] for t in spec.get("tools", [])],
    )
    # Chain each requested modifier onto the node with the pipe operator.
    for mod_spec in spec.get("modifiers", []):
        match mod_spec["type"]:
            case "Oracle":
                n = n | Oracle(
                    n=mod_spec["n"],
                    merge_fn=mod_spec.get("merge_fn"),
                    merge_prompt=mod_spec.get("merge_prompt"),
                )
            case "Each":
                n = n | Each(over=mod_spec["over"], key=mod_spec["key"])
            case "Operator":
                n = n | Operator(when=mod_spec["when"])
    return n

def build_pipeline(llm_output):
    # Construct validates the whole chain when it is created.
    nodes = [build_node(spec) for spec in llm_output["nodes"]]
    return Construct(llm_output["name"], nodes=nodes)
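build_node looks names up with TYPES[...] and TOOLS[...], so a spec that references an unknown type or tool fails with a bare KeyError. A small pre-check can collect every unknown reference into one message that is easier to hand back to the model. This is a sketch under the assumption that the registries are plain dicts as above; unknown_references is a hypothetical helper, and the registries here hold placeholder values so the snippet runs on its own.

```python
def unknown_references(spec: dict, types: dict, tools: dict) -> list[str]:
    """List every type or tool name in the spec that is not registered."""
    problems = []
    for node in spec.get("nodes", []):
        name = node.get("name", "<unnamed>")
        output = node.get("output")
        if output not in types:
            problems.append(f"node '{name}': unknown output type '{output}'")
        for tool_name in node.get("tools", []):
            if tool_name not in tools:
                problems.append(f"node '{name}': unknown tool '{tool_name}'")
    return problems


# Placeholder registries; real ones would map names to schemas and Tool objects.
known_types = {"Claims": object, "MatchResult": object, "Report": object}
known_tools = {"search_codebase": object()}

spec = {
    "name": "demo",
    "nodes": [
        {"name": "decompose", "mode": "think", "output": "Claims"},
        {"name": "verify", "mode": "agent", "output": "Match",
         "tools": ["search_codebase", "run_tests"]},
    ],
}

problems = unknown_references(spec, known_types, known_tools)
for line in problems:
    print(line)
```

Running the check before build_pipeline lets the runtime report all bad names in one round trip instead of one KeyError at a time.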

And runs it:

pipeline = build_pipeline(llm_output)
graph = compile(pipeline)
result = run(graph, input={"node_id": "user-request-42"})

The pipe operator composes modifiers at runtime without modules, function signatures, or class definitions. The LLM emits a list of modifier dicts, and your runtime chains them with a simple for loop:

for mod_spec in spec.get("modifiers", []):
    n = n | build_modifier(mod_spec)
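build_modifier is not defined on this page. A minimal sketch of one possible shape uses a dispatch table keyed by modifier type. The dataclasses below are stand-ins for neograph's Oracle, Each, and Operator so that the snippet runs without the library; their fields copy the keyword arguments used in build_node, and the field types are guesses. MODIFIER_BUILDERS is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-ins for neograph's Oracle, Each, and Operator so this runs alone.
# In real code these would be imported from neograph instead.
@dataclass
class Oracle:
    n: int
    merge_fn: Optional[str] = None
    merge_prompt: Optional[str] = None


@dataclass
class Each:
    over: str
    key: str


@dataclass
class Operator:
    when: str


# Map each modifier "type" to a function that builds it from its spec dict.
MODIFIER_BUILDERS = {
    "Oracle": lambda m: Oracle(
        n=m["n"],
        merge_fn=m.get("merge_fn"),
        merge_prompt=m.get("merge_prompt"),
    ),
    "Each": lambda m: Each(over=m["over"], key=m["key"]),
    "Operator": lambda m: Operator(when=m["when"]),
}


def build_modifier(mod_spec: dict):
    """Build one modifier from its spec dict; reject unknown types loudly."""
    try:
        builder = MODIFIER_BUILDERS[mod_spec["type"]]
    except KeyError:
        raise ValueError(f"unknown modifier: {mod_spec.get('type')!r}") from None
    return builder(mod_spec)
```

Compared with a match statement, the table makes an unknown type an explicit error rather than a silently skipped case, and supporting a new modifier means adding one entry.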

This is impossible with @node — the decorator needs a function at source-code time, with a specific signature. ForwardConstruct is similar — it needs a class definition. The programmatic Node(...) | Modifier(...) form works with a dict as input, no Python syntax required.

When you call Construct(name, nodes=[...]), NeoGraph validates the whole chain before returning:

  • Every node’s declared input is type-checked against upstream output types
  • Modifier chains are validated (e.g., Each requires a dotted path that resolves to list[X] where X matches the downstream input)
  • Fan-in parameters are type-checked across all upstreams
  • Cycles and self-dependencies raise ConstructError

If the LLM emits a malformed spec (say, a node declares a MatchResult input but its upstream actually produces dict[str, MatchResult] via the Each modifier), the validator catches it with a clear error pointing at the broken edge. You surface that error back to the LLM, and it can revise the spec. A malformed pipeline never executes.

The cleanest integration is to expose build_pipeline as a tool the LLM can call:

# ConstructError is assumed here to be importable from the package root.
from neograph import ConstructError, tool

@tool
def construct_pipeline(spec: dict) -> str:
    """Build and validate a NeoGraph pipeline from a spec dict.

    Returns a pipeline ID on success, or a validation error on failure.
    """
    try:
        pipeline = build_pipeline(spec)
        pipeline_id = register(pipeline)  # your storage
        return f"Pipeline '{spec['name']}' built successfully (id={pipeline_id})"
    except ConstructError as e:
        # Return the diagnostic as text so the LLM can read it and retry.
        return f"Validation failed: {e}"

The LLM calls construct_pipeline with a JSON spec. If the validator rejects it, the error comes back as the tool’s return value — the LLM sees the exact problem and can try again. If it succeeds, the LLM calls a second tool to dispatch: run_pipeline(pipeline_id, input={...}).
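The register call and the run_pipeline tool are left to the application. A minimal in-memory version of the storage side might look like the following; PIPELINES, register, and lookup are hypothetical names, and a real service would likely persist specs rather than hold built objects in a process-local dict.

```python
import uuid

# Process-local store: pipeline ID -> built pipeline object.
PIPELINES: dict[str, object] = {}


def register(pipeline: object) -> str:
    """Store a built pipeline and return a fresh opaque ID."""
    pipeline_id = uuid.uuid4().hex
    PIPELINES[pipeline_id] = pipeline
    return pipeline_id


def lookup(pipeline_id: str) -> object:
    """Fetch a registered pipeline, with an error the LLM can act on."""
    try:
        return PIPELINES[pipeline_id]
    except KeyError:
        raise LookupError(f"no pipeline registered with id={pipeline_id}") from None
```

A run_pipeline tool would then call lookup(pipeline_id) and pass the result to compile() and run() as in the earlier example.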

This gives you a self-correcting loop: the LLM proposes a pipeline, validation either accepts or rejects with a diagnostic, and the LLM iterates until the spec is correct.
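The control flow of that loop can be sketched without any model or NeoGraph dependency. In the snippet below, design_pipeline is a hypothetical driver, propose stands in for the LLM call, build stands in for build_pipeline, and ConstructError is a local stand-in for the library's exception. The attempt cap is an added assumption so that a model that never converges cannot loop forever.

```python
from typing import Callable, Optional


class ConstructError(Exception):
    """Stand-in for neograph's ConstructError so this snippet runs alone."""


def design_pipeline(
    propose: Callable[[Optional[str]], dict],
    build: Callable[[dict], object],
    max_attempts: int = 3,
):
    """Ask for a spec, try to build it, and feed errors back until it builds."""
    feedback = None
    for _ in range(max_attempts):
        spec = propose(feedback)      # the LLM call, given the last error
        try:
            return build(spec)        # e.g. build_pipeline from this page
        except ConstructError as exc:
            feedback = f"Validation failed: {exc}"
    raise RuntimeError(f"no valid spec after {max_attempts} attempts: {feedback}")


# Fake model and builder to show the control flow.
def fake_propose(feedback):
    # The first call has no feedback and returns a broken spec; the retry fixes it.
    return {"name": "demo", "nodes": ["decompose"] if feedback else []}


def fake_build(spec):
    if not spec["nodes"]:
        raise ConstructError("pipeline has no nodes")
    return spec


result = design_pipeline(fake_propose, fake_build)
```

With tool calling, the model's own agent loop usually plays the role of design_pipeline; writing it out explicitly is mainly useful when the spec comes from JSON mode instead.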

The tool-calling pattern above gives the LLM full control over the Python API. For a more constrained approach, the LLM generates a YAML or JSON spec and load_spec handles the rest:

name: requirements-analysis
nodes:
  - name: decompose
    mode: think
    prompt: "Decompose the requirements into discrete claims."
    model: reason
    outputs: Claims
    oracle:
      n: 3
      merge_fn: merge_claims
  - name: verify
    mode: agent
    prompt: "Verify this claim against the codebase."
    model: fast
    outputs: MatchResult
    tools: [search_codebase]
    each:
      over: decompose.items
      key: label
  - name: report
    mode: think
    prompt: "Produce the final analysis report."
    model: fast
    outputs: Report
pipeline:
  nodes: [decompose, verify, report]

Then load and run it, with yaml_string holding the spec above:

from neograph import load_spec, compile, run
construct = load_spec(yaml_string)  # or a file path, or a dict
graph = compile(construct)
result = run(graph, input={"node_id": "user-request-42"})

load_spec parses the spec, resolves type names from the registry, applies modifiers (Oracle, Each, Loop, Operator), validates the full chain, and returns a Construct. The same validator that checks @node and programmatic pipelines runs on the result — malformed specs get a clear error before anything executes.

The spec format supports everything the programmatic API does: LLM modes, fan-out, ensembles, loops, sub-constructs with isolated state, and human-in-the-loop interrupts. See the Pipeline Spec Format reference for the full schema and examples.

The tool-calling pattern exposes the full Python API. The spec format is more constrained:

  • Validated schema: the spec is checked against a JSON Schema at parse time, before any Python runs
  • LLM-friendly: YAML is a natural output format for LLMs — no Python syntax, no imports, no class definitions
  • Auditable: the YAML spec is a readable, diffable artifact that can be logged, versioned, and reviewed
  • Self-correcting: when validation fails, the error message tells the LLM exactly what to fix
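For the auditable point, one common approach is to log a stable fingerprint alongside each spec so that identical pipelines can be recognized across runs. The sketch below uses only the standard library; spec_fingerprint is a hypothetical helper, not part of NeoGraph.

```python
import hashlib
import json


def spec_fingerprint(spec: dict) -> str:
    """Hash a spec in canonical form so key order does not change the result."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"name": "demo", "nodes": [{"name": "report", "mode": "think"}]}
b = {"nodes": [{"mode": "think", "name": "report"}], "name": "demo"}

same = spec_fingerprint(a) == spec_fingerprint(b)  # key order is ignored
```

Because the hash is taken over parsed data rather than raw text, two YAML files that differ only in formatting or key order get the same fingerprint.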

Use tool calling when the LLM needs full flexibility (dynamic modifier logic, custom scripted functions). Use specs when the pipeline topology is the main decision and you want a clean contract between the LLM and the runtime.

The same pattern works when pipelines come from YAML, JSON config files, or a database. With load_spec, there’s no custom build_pipeline function needed:

from neograph import load_spec, compile, run
construct = load_spec("pipelines/daily-ingestion.yaml")
graph = compile(construct)
run(graph, input={"node_id": "daily-001"})

With the build_pipeline approach, the runtime logic is identical to the LLM case; only the source of the spec changes:

import yaml
with open("pipelines/daily-ingestion.yaml") as f:
    spec = yaml.safe_load(f)
pipeline = build_pipeline(spec)
graph = compile(pipeline)
run(graph, input={"node_id": "daily-001"})
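Note that the two examples on this page use different shapes: the load_spec YAML writes outputs and per-modifier keys (oracle, each), while build_pipeline reads output and a modifiers list. If one file has to feed both paths, a small adapter can translate between them. This is a sketch that assumes exactly the key names shown on this page; to_programmatic_spec is a hypothetical helper, and it ignores the top-level pipeline section.

```python
# YAML-style modifier keys shown on this page -> modifier "type" names.
MODIFIER_KEYS = {"oracle": "Oracle", "each": "Each"}


def to_programmatic_spec(yaml_spec: dict) -> dict:
    """Convert the load_spec-style dict into the shape build_pipeline reads."""
    nodes = []
    for node in yaml_spec["nodes"]:
        # Copy plain fields, leaving out the keys that get renamed or folded.
        converted = {
            k: v for k, v in node.items()
            if k not in MODIFIER_KEYS and k != "outputs"
        }
        converted["output"] = node["outputs"]
        # Fold per-modifier keys into a single list of {"type": ..., ...} dicts.
        modifiers = [
            {"type": MODIFIER_KEYS[k], **node[k]}
            for k in MODIFIER_KEYS if k in node
        ]
        if modifiers:
            converted["modifiers"] = modifiers
        nodes.append(converted)
    return {"name": yaml_spec["name"], "nodes": nodes}


yaml_spec = {
    "name": "requirements-analysis",
    "nodes": [
        {"name": "decompose", "mode": "think", "outputs": "Claims",
         "oracle": {"n": 3, "merge_fn": "merge_claims"}},
    ],
}
converted = to_programmatic_spec(yaml_spec)
```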

Four paths to the same compiler:

  • @node — humans writing pipelines in source code
  • ForwardConstruct — humans writing branching logic as Python
  • Node + Construct + | — LLMs and config systems building pipelines via tool calling
  • load_spec — LLMs and config files describing pipelines as YAML/JSON

The runtime APIs are first-class citizens. They’re the path for every use case where “who writes the pipeline” isn’t a human at a keyboard.


Documentation © 2025-2026 Constantine Mirin, mirin.pro. Licensed under CC BY-ND 4.0.