# 2. LLM Produce + Gather
This walkthrough introduces NeoGraph's two primary LLM modes: `produce` (a single structured call) and `gather` (a ReAct tool loop with budget enforcement). Together they form the most common two-step pattern: decompose a problem, then research it.

The pipeline is built with `@node`-decorated functions and `construct_from_module`. Mode is inferred from the kwargs: `prompt=` + `model=` means an LLM call; add `tools=` and it becomes a gather loop.

The example uses fake LLM and tool implementations so you can run it without API keys. Replace them with real ones for production.
## What you will learn

- Configuring the LLM layer with `configure_llm()`
- Using `@node` with `prompt=` and `model=` for LLM produce mode
- Using `@node` with `mode="gather"` and `tools=` for ReAct tool loops
- Declaring tools with `Tool(name, budget)`
- Registering tool factories with `register_tool_factory()`
- Parameter-name wiring between LLM nodes
## Schemas

```python
from pydantic import BaseModel


class Requirement(BaseModel, frozen=True):
    text: str


class Claims(BaseModel, frozen=True):
    items: list[str]


class ResearchResult(BaseModel, frozen=True):
    findings: list[dict[str, str]]
```

The `decompose` node has `prompt=` and `model=` but no `mode=`, so NeoGraph infers `mode="produce"` (single LLM call, structured output). The function body is `...` because the LLM handles execution.

The `research` node explicitly sets `mode="gather"` and adds `tools=`. It runs a ReAct loop: call the LLM, execute tool calls, feed results back, repeat until the budget is exhausted.
```python
from neograph import Tool, node


@node(output=Claims, model="fast", prompt="req/decompose")
def decompose() -> Claims: ...


@node(
    mode="gather",
    output=ResearchResult,
    model="reason",
    prompt="req/research",
    tools=[Tool(name="search_codebase", budget=2)],
)
def research(decompose: Claims) -> ResearchResult: ...
```

The parameter name `decompose` on `research` wires it to the upstream `decompose` node.
## The complete pipeline

```python
"""Produce + Gather: decompose a requirement, then research with tools.

Run: python 02_produce_and_gather.py
"""

from __future__ import annotations

import sys

from langchain_core.messages import AIMessage
from pydantic import BaseModel

from neograph import (
    Tool,
    compile,
    configure_llm,
    construct_from_module,
    node,
    register_tool_factory,
    run,
)

# -- Schemas ----------------------------------------------------------------


class Requirement(BaseModel, frozen=True):
    text: str


class Claims(BaseModel, frozen=True):
    items: list[str]


class ResearchResult(BaseModel, frozen=True):
    findings: list[dict[str, str]]


# -- Fake LLM (replace with real OpenRouter/OpenAI in production) -----------


class FakeDecomposeLLM:
    """Simulates an LLM that decomposes a requirement into claims."""

    def with_structured_output(self, model):
        self._model = model
        return self

    def invoke(self, messages, **kwargs):
        return self._model(items=[
            "system shall authenticate users",
            "system shall log failed attempts",
            "system shall rate-limit login",
        ])


class FakeResearchLLM:
    """Simulates an LLM that calls a search tool, then responds."""

    def __init__(self):
        self._call_count = 0

    def bind_tools(self, tools):
        clone = FakeResearchLLM()
        clone._call_count = self._call_count
        clone._has_tools = len(tools) > 0
        return clone

    def invoke(self, messages, **kwargs):
        self._call_count += 1
        if getattr(self, "_has_tools", True) and self._call_count <= 3:
            msg = AIMessage(content="")
            msg.tool_calls = [{
                "name": "search_codebase",
                "args": {"query": f"claim-{self._call_count}"},
                "id": f"call-{self._call_count}",
            }]
            return msg
        return AIMessage(content="research complete")

    def with_structured_output(self, model):
        self._model = model
        return self


# -- Fake tool --------------------------------------------------------------

search_count = {"n": 0}


class FakeSearchTool:
    name = "search_codebase"

    def invoke(self, args):
        search_count["n"] += 1
        return f"Found 3 references for: {args.get('query', '?')}"


register_tool_factory("search_codebase", lambda config, tool_config: FakeSearchTool())

# -- Configure LLM layer ----------------------------------------------------


def llm_factory(tier):
    if tier == "fast":
        return FakeDecomposeLLM()
    return FakeResearchLLM()


configure_llm(
    llm_factory=llm_factory,
    prompt_compiler=lambda template, data: [{"role": "user", "content": "analyze"}],
)

# -- Pipeline nodes ---------------------------------------------------------


@node(output=Claims, model="fast", prompt="req/decompose")
def decompose() -> Claims: ...


@node(
    mode="gather",
    output=ResearchResult,
    model="reason",
    prompt="req/research",
    tools=[Tool(name="search_codebase", budget=2)],
)
def research(decompose: Claims) -> ResearchResult: ...


pipeline = construct_from_module(sys.modules[__name__], name="requirement-analysis")

# -- Run --------------------------------------------------------------------

if __name__ == "__main__":
    graph = compile(pipeline)
    result = run(graph, input={"node_id": "REQ-042"})

    print(f"Decomposed into {len(result['decompose'].items)} claims:")
    for claim in result["decompose"].items:
        print(f"  - {claim}")

    print(f"\nSearch tool called {search_count['n']} times (budget was 2)")
    print(f"Research complete: {result['research'] is not None}")
```

## Expected output

```
Decomposed into 3 claims:
  - system shall authenticate users
  - system shall log failed attempts
  - system shall rate-limit login

Search tool called 2 times (budget was 2)
Research complete: True
```

## Mode inference

NeoGraph infers the execution mode from the kwargs you pass to `@node`:
| Present kwargs | Inferred mode | Behavior |
|---|---|---|
| `prompt=` + `model=` | `produce` | Single LLM call, structured JSON output |
| `prompt=` + `model=` + `tools=` | `gather` | ReAct tool loop |
| Neither | `scripted` | Function body runs as-is |
You can also set `mode=` explicitly. It is never required, since `tools=` alone triggers gather inference, but the `research` node above sets `mode="gather"` anyway to make the distinction from `produce` obvious.
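The inference rule in the table can be pictured as a small standalone sketch. Note that `infer_mode` is an illustrative helper written for this page, not part of the NeoGraph API:

```python
# Hypothetical sketch of kwargs-based mode inference (not NeoGraph internals).
def infer_mode(prompt=None, model=None, tools=None, mode=None):
    if mode is not None:
        return mode        # an explicit mode= always wins
    if prompt and model and tools:
        return "gather"    # prompt + model + tools -> ReAct tool loop
    if prompt and model:
        return "produce"   # prompt + model -> single structured call
    return "scripted"      # neither -> the function body runs as-is


print(infer_mode(prompt="req/decompose", model="fast"))                     # produce
print(infer_mode(prompt="req/research", model="reason", tools=["search"]))  # gather
print(infer_mode())                                                         # scripted
```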
## Produce mode

A produce node makes a single LLM call and expects structured JSON back. The framework calls `llm.with_structured_output(output_model)` and parses the response into your Pydantic schema automatically.

```python
@node(output=Claims, model="fast", prompt="req/decompose")
def decompose() -> Claims: ...
```

No tool loop. No message history management. One call, one typed result.
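Conceptually, the produce step reduces to binding the schema and making one call. A minimal sketch of that flow, under assumptions: `run_produce` and `StubLLM` are illustrative stand-ins (not NeoGraph internals), and a plain frozen dataclass stands in for the Pydantic schema:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claims:
    items: list


class StubLLM:
    """Stand-in for a LangChain chat model (illustrative only)."""

    def with_structured_output(self, schema):
        self._schema = schema
        return self

    def invoke(self, messages):
        # A real model would return parsed JSON coerced into the schema.
        return self._schema(items=["system shall authenticate users"])


def run_produce(llm, output_model, messages):
    # Bind the output schema, make one call, get a typed result back.
    structured = llm.with_structured_output(output_model)
    return structured.invoke(messages)


claims = run_produce(StubLLM(), Claims, [{"role": "user", "content": "analyze"}])
print(claims.items)  # ['system shall authenticate users']
```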
## Gather mode

A gather node runs a ReAct loop: call the LLM; if it requests tool calls, execute them, feed the results back, and repeat. The loop continues until the LLM responds without tool calls, or all tool budgets are exhausted.

```python
@node(
    mode="gather",
    output=ResearchResult,
    model="reason",
    prompt="req/research",
    tools=[Tool(name="search_codebase", budget=2)],
)
def research(decompose: Claims) -> ResearchResult: ...
```

## Tool budget enforcement
Each `Tool` has a budget: the maximum number of calls allowed. When a tool's budget is exhausted, the framework removes it from the LLM's available tools. When all budgeted tools are spent, the LLM is forced to produce a final response.

This prevents runaway loops and controls API costs. Set `budget=0` for unlimited calls.
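The bookkeeping behind this can be sketched in isolation. `BudgetTracker` is an illustrative model of the semantics described above (budget=0 unlimited, exhausted tools removed), not the actual NeoGraph implementation:

```python
class BudgetTracker:
    """Tracks per-tool call budgets; budget=0 means unlimited (assumed)."""

    def __init__(self, budgets):
        self._budgets = dict(budgets)                # name -> max calls
        self._calls = {name: 0 for name in budgets}  # name -> calls so far

    def available(self):
        # Tools with exhausted budgets drop out of the LLM's tool set.
        return [name for name, budget in self._budgets.items()
                if budget == 0 or self._calls[name] < budget]

    def record(self, name):
        if name not in self.available():
            raise RuntimeError(f"budget exhausted for {name!r}")
        self._calls[name] += 1


tracker = BudgetTracker({"search_codebase": 2})
tracker.record("search_codebase")
tracker.record("search_codebase")
print(tracker.available())  # [] -> the LLM must now produce a final response
```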
## configure_llm()

Before using any LLM node, you must call `configure_llm()` with two functions:

```python
configure_llm(
    llm_factory=llm_factory,          # (tier) -> BaseChatModel
    prompt_compiler=prompt_compiler,  # (template, input_data) -> list[BaseMessage]
)
```

- `llm_factory` receives a tier string (`"fast"`, `"reason"`, etc.) and returns a LangChain chat model. In production, map tiers to real models (e.g., `"fast"` -> GPT-4o-mini, `"reason"` -> Claude Sonnet).
- `prompt_compiler` receives a template name and the typed input data, and returns a message list. This is where you build your prompts.
## register_tool_factory()

Tool factories create tool instances at runtime. The factory receives the pipeline config and any per-tool config:

```python
register_tool_factory(
    "search_codebase",
    lambda config, tool_config: MySearchTool(api_key=config["configurable"]["api_key"]),
)
```

This lets you create tools that depend on runtime context (API keys, rate limiters, database connections) without hardcoding them.
Documentation © 2025-2026 Constantine Mirin, mirin.pro. Licensed under CC BY-ND 4.0.