LangGraph + Claude Agent SDK: The Ultimate Guide to Multi-Agent Systems in 2026

The agent era isn't coming. It's here. And if you're building serious agentic systems in 2026, two tools belong in your arsenal: LangGraph and Anthropic's Claude Agent SDK (which Anthropic just renamed from the Claude Code SDK — we'll get into that). This isn't a "hello world" primer. This is the guide I wish existed when I started going deep on these.

Let's go.

The Multi-Agent Landscape

Before we dive into the tools, let's zoom out. The market right now looks something like this:

Framework	Philosophy	Best For
LangGraph	Graph-based, low-level, model-agnostic	Complex workflows, durable execution, fine-grained control
Claude Agent SDK	Agent loop in a box, Claude-native	Code tasks, file ops, rapid multi-agent builds
AutoGen (Microsoft)	Conversation-driven, role-based agents	Research automation, conversational teams
CrewAI	Role-playing agent crews	Business process automation with natural-language roles
Haystack	Pipeline-based, retrieval-focused	RAG, search, document intelligence
Semantic Kernel	Microsoft-backed, .NET/Python	Enterprise teams already in Azure

This guide focuses on LangGraph and the Claude Agent SDK: one for control freaks (affectionate), one for velocity.

Part 1: LangGraph

What It Is

LangGraph is a low-level orchestration framework for stateful, long-running agents. It's built by the LangChain team but is fully independent — you don't need LangChain to use it. Companies like Klarna, Replit, and Elastic run it in production.

The mental model: your agent is a directed graph. Nodes are functions (think: LLM calls, tool invocations, conditional checks). Edges define flow. State is typed and persisted across the entire execution.

This is fundamentally different from a simple chain or loop. You get:

Branching — conditional logic based on state
Cycles — agents that loop back and retry
Checkpointing — resume from exactly where you left off after a crash
Human-in-the-loop — pause at any node, get a human decision, continue

Core Concepts

StateGraph

Everything starts with StateGraph. You define a typed state schema, add nodes, connect them with edges.

from langgraph.graph import StateGraph, MessagesState, START, END

def agent_node(state: MessagesState):
    # call your LLM here
    return {"messages": [{"role": "ai", "content": "..."}]}

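def tool_node(state: MessagesState):
    # execute tools based on last message
    ...

def should_use_tools(state: MessagesState) -> str:
    # route to "tools" when the last AI message requested a tool call
    last = state["messages"][-1]
    return "tools" if getattr(last, "tool_calls", None) else "end"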

graph = StateGraph(MessagesState)
graph.add_node("agent", agent_node)
graph.add_node("tools", tool_node)

# conditional routing: if agent called a tool, go to tools; else end
graph.add_conditional_edges("agent", should_use_tools, {
    "tools": "tools",
    "end": END
})
graph.add_edge("tools", "agent")  # tools feed back into agent
graph.add_edge(START, "agent")

app = graph.compile()

That loop — agent → tools → agent — is the ReAct pattern. LangGraph makes it explicit and controllable.

Custom State

MessagesState is a prebuilt schema with a single messages key. But you can define any typed state:

from typing import Annotated, TypedDict
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    messages: Annotated[list, add_messages]
    plan: str
    iterations: int
    approved: bool

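State is updated by merge, not mutated in place: each node returns a partial update and LangGraph merges it into the current state. Keys annotated with a reducer (like add_messages above) combine the old and new values; keys without one are overwritten. Reducers are what make it safe for parallel branches to write to the same key.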
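To make the merge rule concrete, here is a plain-Python model of it. This is a sketch of the semantics only, not LangGraph's internals; `merge_update` and the `reducers` mapping are hypothetical names introduced for illustration.

```python
import operator
from typing import Any, Callable

# A reducer says how an incoming value combines with the existing one.
Reducer = Callable[[Any, Any], Any]


def merge_update(state: dict, update: dict, reducers: dict[str, Reducer]) -> dict:
    """Return a new state with `update` merged in; `state` is left untouched."""
    merged = dict(state)
    for key, value in update.items():
        if key in reducers and key in merged:
            # Key has a reducer: combine old and new values.
            merged[key] = reducers[key](merged[key], value)
        else:
            # No reducer: the new value simply replaces the old one.
            merged[key] = value
    return merged


reducers = {"messages": operator.add}  # append lists instead of replacing
state = {"messages": ["hi"], "plan": "", "iterations": 0}

# A node returns only the keys it wants to change.
state_after = merge_update(state, {"messages": ["hello"], "iterations": 1}, reducers)
```

The node never touches `plan`, so it carries over unchanged, while `messages` grows and `iterations` is overwritten.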

Durable Execution

LangGraph's killer feature for production: checkpointing.

from langgraph.checkpoint.memory import MemorySaver

checkpointer = MemorySaver()  # or SqliteSaver, PostgresSaver
app = graph.compile(checkpointer=checkpointer)

# each run gets a thread_id — resume with same id after failure
config = {"configurable": {"thread_id": "run-42"}}
result = await app.ainvoke({"messages": [...]}, config=config)

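If your agent crashes mid-execution, a retry with the same thread_id resumes from the last checkpoint. Note that MemorySaver keeps checkpoints in process memory, so surviving a process crash requires a persistent saver such as SqliteSaver or PostgresSaver. For long-running tasks (minutes, hours), this is non-negotiable.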
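The resume-on-retry idea can be sketched without the framework. The toy runner below saves state to a JSON file after every step and skips completed steps on the next attempt. It is an illustration of the concept under simplified assumptions (linear steps, JSON-serializable state), not how LangGraph's checkpointers are implemented; `run_with_checkpoints` and the step functions are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def run_with_checkpoints(steps, state, thread_id, store_dir):
    """Run `steps` in order, saving after each one; skip steps already done."""
    path = Path(store_dir) / f"{thread_id}.json"
    done = 0
    if path.exists():
        # A previous attempt left a checkpoint: pick up where it stopped.
        saved = json.loads(path.read_text())
        state, done = saved["state"], saved["done"]
    for index in range(done, len(steps)):
        state = steps[index](state)
        path.write_text(json.dumps({"state": state, "done": index + 1}))
    return state


calls = {"add_one": 0, "flaky": 0}


def add_one(state):
    calls["add_one"] += 1
    return {"n": state["n"] + 1}


def flaky(state):
    calls["flaky"] += 1
    if calls["flaky"] == 1:
        raise RuntimeError("simulated crash")
    return {"n": state["n"] * 10}


steps = [add_one, flaky]
store = tempfile.mkdtemp()

try:
    run_with_checkpoints(steps, {"n": 1}, "run-42", store)
except RuntimeError:
    pass  # the first attempt dies inside the second step

# Same thread id: the first step is not repeated, only the failed one reruns.
result = run_with_checkpoints(steps, {"n": 1}, "run-42", store)
```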

Human-in-the-Loop

Pause execution at any node. Wait for human input. Continue.

from langgraph.types import interrupt

def review_node(state: AgentState):
    # this pauses and surfaces the state to a human
    decision = interrupt({"plan": state["plan"], "action": "approve?"})
    return {"approved": decision["approved"]}

This maps directly to real-world workflows: code review approval, compliance check, content moderation. Not a hack — built into the framework at the runtime level.

Multi-Agent Patterns in LangGraph

LangGraph supports three main multi-agent topologies:

1. Supervisor Pattern A central supervisor routes tasks to specialized subagents:

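def supervisor(state):
    # decide which agent handles next task
    # llm.invoke returns a message object, so read its text content
    next_agent = llm.invoke(routing_prompt + str(state)).content.strip()
    return {"next": next_agent}

builder = StateGraph(State)
builder.add_node("supervisor", supervisor)
builder.add_node("coder", coding_agent)
builder.add_node("researcher", research_agent)
builder.add_edge(START, "supervisor")
builder.add_conditional_edges("supervisor", lambda s: s["next"],
    {"coder": "coder", "researcher": "researcher", "FINISH": END})
# workers report back so the supervisor can route the next step
builder.add_edge("coder", "supervisor")
builder.add_edge("researcher", "supervisor")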

2. Network Pattern Agents communicate peer-to-peer. Each agent decides where to route next.

3. Hierarchical Trees of agents: supervisors managing sub-supervisors managing workers. A compiled graph can itself be added as a node in a parent graph, so each level is just another subgraph.
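In the supervisor pattern, the route name comes straight from model output, and a conditional edge with a fixed route map will typically fail on any value outside that map. One way to guard against that is to normalize the raw text before routing. The helper below is a hypothetical sketch (`parse_route` is not part of LangGraph) that falls back to FINISH when the answer is not a known route.

```python
def parse_route(raw: str, allowed: set[str], fallback: str = "FINISH") -> str:
    """Map free-form router output onto one of the allowed route names."""
    # Drop surrounding whitespace, quotes, backticks and a trailing period.
    cleaned = raw.strip().strip("\"'`.").strip()
    for route in allowed:
        if cleaned.lower() == route.lower():
            return route
    # Anything unrecognized ends the run instead of crashing the graph.
    return fallback


ROUTES = {"coder", "researcher", "FINISH"}

tidy = parse_route(" Coder.\n", ROUTES)
lowercase_finish = parse_route("finish", ROUTES)
rambling = parse_route("I think the coder should handle this", ROUTES)
```

A supervisor node could call this on the model's reply before returning `{"next": ...}`. Falling back to FINISH is a conservative choice; retrying the routing prompt is another reasonable option.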

LangSmith Integration

LangGraph's observability story is via LangSmith. Set two env vars:

export LANGSMITH_TRACING=true
export LANGSMITH_API_KEY=your-key

Every graph execution is traced — node inputs/outputs, latency, token usage, state diffs. For debugging multi-agent systems, this is invaluable. You can't debug what you can't see.

When to Use LangGraph

You need precise control over agent behavior at each step
Your workflow has complex conditional logic (not just loops)
You need durable execution (long-running, resumable)
You're model-agnostic (OpenAI, Anthropic, Gemini, local models)
You need to pause for humans mid-execution
You're building a production system that needs to handle failures gracefully

Part 2: Claude Agent SDK

Wait — It Got Renamed

Hot off the presses: Anthropic just renamed the Claude Code SDK → Claude Agent SDK. If you're on old docs, you'll see migration notes. The packages:

# TypeScript
npm install @anthropic-ai/claude-agent-sdk

# Python
pip install claude-agent-sdk

The rebrand signals Anthropic's intent: this isn't just a tool for coding tasks. It's a general-purpose agent runtime.

What It Is

The Claude Agent SDK gives you the same agent loop, tools, and context management that power the Claude Code CLI — but programmable. You get an autonomous agent that can:

Read and write files
Run terminal commands
Search the web
Edit code with surgical precision
Spawn subagents for parallel work

The core API is dead simple:

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Audit src/ for security vulnerabilities and write a report",
  options: { allowedTools: ["Read", "Grep", "Glob", "Write"] }
})) {
  if ("result" in message) console.log(message.result);
}

query() returns an async iterator that streams messages as Claude works. You get Claude's reasoning, tool calls, tool results, and the final output — all in one stream.
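Since the stream mixes intermediate messages with the final one, a small helper that drains it and keeps only the last `result` is often handy. The sketch below is written against stand-in message classes so it runs without the SDK or an API key; `FakeAssistantMessage`, `FakeResultMessage`, `fake_query`, and `final_result` are all hypothetical names, and the only assumption about real messages is that the final one exposes a `result` attribute.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class FakeAssistantMessage:  # stand-in for an intermediate message
    text: str


@dataclass
class FakeResultMessage:  # stand-in for the final message carrying `result`
    result: str


async def final_result(stream: AsyncIterator[object]) -> Optional[str]:
    """Drain the stream and return the last `result` seen, or None."""
    last = None
    async for message in stream:
        result = getattr(message, "result", None)
        if result is not None:
            last = result
    return last


async def fake_query():
    # Mimics the shape of the stream: progress first, the result last.
    yield FakeAssistantMessage("Scanning src/ ...")
    yield FakeResultMessage("No critical issues found.")


report = asyncio.run(final_result(fake_query()))
```

With the real SDK, the same helper would be passed the iterator returned by `query(...)` instead of `fake_query()`.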

Built-in Tools

No glue code. No tool execution to implement yourself. These are ready:

Tool	What It Does
`Read`	Read any file in working dir
`Write`	Create new files
`Edit`	Make precise, surgical edits
`Bash`	Run terminal commands, scripts, git
`Glob`	Find files by pattern (`**/*.ts`)
`Grep`	Regex search across file contents
`WebSearch`	Live web search
`WebFetch`	Fetch and parse web pages
`AskUserQuestion`	Ask for clarification with multiple choice
`Task`	Invoke a subagent

This is the same toolset powering Claude Code. Battle-tested, production-hardened.

Hooks — The Control Plane

Hooks let you intercept and modify agent behavior at key lifecycle points:

from datetime import datetime
from claude_agent_sdk import query, ClaudeAgentOptions, HookMatcher

async def audit_file_changes(input_data, tool_use_id, context):
    file_path = input_data.get("tool_input", {}).get("file_path", "unknown")
    with open("./audit.log", "a") as f:
        f.write(f"{datetime.now()}: modified {file_path}\n")
    return {}

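async def block_dangerous_commands(input_data, tool_use_id, context):
    cmd = input_data.get("tool_input", {}).get("command", "")
    if "rm -rf" in cmd or "DROP TABLE" in cmd:
        # deny the call via the PreToolUse permission-decision fields
        return {
            "hookSpecificOutput": {
                "hookEventName": "PreToolUse",
                "permissionDecision": "deny",
                "permissionDecisionReason": "dangerous command detected",
            }
        }
    return {}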

async for message in query(
    prompt="Refactor the auth module",
    options=ClaudeAgentOptions(
        permission_mode="acceptEdits",
        hooks={
            "PostToolUse": [HookMatcher(matcher="Edit|Write", hooks=[audit_file_changes])],
            "PreToolUse": [HookMatcher(matcher="Bash", hooks=[block_dangerous_commands])],
        }
    )
):
    ...

Available hooks: PreToolUse, PostToolUse, Stop, SessionStart, SessionEnd, UserPromptSubmit. This is where you enforce safety policies, log for compliance, or inject dynamic context.
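A substring check like the one in the Bash hook above is easy to slip past (`rm -fr`, `rm -r -f`, lowercase `drop table`). A slightly sturdier predicate can tokenize the command first. The function below is a hypothetical helper a PreToolUse hook could call; it is a sketch of the idea and not a complete shell parser or a real security boundary.

```python
import re
import shlex

DROP_TABLE = re.compile(r"\bdrop\s+table\b", re.IGNORECASE)


def is_dangerous(command: str) -> bool:
    """Flag recursive forced deletes and DROP TABLE statements."""
    if DROP_TABLE.search(command):
        return True
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unbalanced quotes: fail closed rather than guess
    for index, token in enumerate(tokens):
        if token.rsplit("/", 1)[-1] != "rm":
            continue
        flags = ""
        # Collect short flags (-r, -f, -rf, -fr) and the two long forms.
        for arg in tokens[index + 1:]:
            if arg == "--recursive":
                flags += "r"
            elif arg == "--force":
                flags += "f"
            elif arg.startswith("-") and not arg.startswith("--"):
                flags += arg[1:]
        if "r" in flags.lower() and "f" in flags:
            return True
    return False
```

Inside the hook, `if is_dangerous(cmd):` would replace the substring test. Failing closed on unparseable input is a deliberate choice here: a blocked harmless command costs a retry, while a missed destructive one costs data.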

Subagents — The Big One

This is where the Claude Agent SDK gets serious. You can define named subagents with specialized prompts and tool restrictions, and Claude orchestrates delegation automatically:

from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition

async for message in query(
    prompt="Review the auth module for security issues and test coverage",
    options=ClaudeAgentOptions(
        # Task tool is required — it's how Claude invokes subagents
        allowed_tools=["Read", "Grep", "Glob", "Bash", "Task"],
        agents={
            "security-reviewer": AgentDefinition(
                description="Expert in security vulnerabilities. Use for auth, crypto, injection risks.",
                prompt="""You are a security specialist. Review code for:
- SQL injection, XSS, CSRF
- Insecure crypto (MD5, SHA1 for passwords)
- Hardcoded secrets
- Auth bypass patterns
Be specific: file name, line number, severity.""",
                tools=["Read", "Grep", "Glob"],  # read-only — can't accidentally modify
                model="sonnet",
            ),
            "test-runner": AgentDefinition(
                description="Runs test suites and analyzes coverage. Use for test execution.",
                prompt="""You are a test specialist. Run tests, analyze output, report failures with context.""",
                tools=["Bash", "Read", "Grep"],
            ),
        }
    )
):
    ...

Claude reads the description fields and decides which subagent to delegate to. You can also request one explicitly: "Use the security-reviewer agent to audit auth.py."

Three ways to define subagents:

Programmatic (above) — recommended for SDK apps
Filesystem — markdown files in .claude/agents/ (for Claude Code projects)
Built-in — a general-purpose subagent always available via the Task tool

Permission Modes

ClaudeAgentOptions(permission_mode="acceptEdits")  # auto-approve file writes
ClaudeAgentOptions(permission_mode="default")       # ask before each tool use
ClaudeAgentOptions(permission_mode="bypassPermissions")  # full auto (use with care)

For CI/CD pipelines, acceptEdits or bypassPermissions makes sense. For interactive sessions, default keeps you in the loop.
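One way to apply that rule without hardcoding it is to pick the mode from the environment. The sketch below assumes the common convention that CI systems set a `CI` environment variable; `pick_permission_mode` is a hypothetical helper, and the mode strings are the ones listed above.

```python
import os
from typing import Mapping


def pick_permission_mode(env: Mapping[str, str] = os.environ) -> str:
    """Return "acceptEdits" in CI, "default" when a human is at the keyboard."""
    # Assumption: the CI system exports CI=true (or 1 / yes).
    in_ci = env.get("CI", "").lower() in {"1", "true", "yes"}
    return "acceptEdits" if in_ci else "default"


ci_mode = pick_permission_mode({"CI": "true"})
local_mode = pick_permission_mode({})
```

The result can be passed straight to `ClaudeAgentOptions(permission_mode=...)`. The sketch deliberately never returns bypassPermissions, which is better left as an explicit opt-in.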

Cloud Provider Support

The SDK supports multiple backends:

# Amazon Bedrock
CLAUDE_CODE_USE_BEDROCK=1

# Google Vertex AI
CLAUDE_CODE_USE_VERTEX=1

# Microsoft Azure AI Foundry
CLAUDE_CODE_USE_FOUNDRY=1

Enterprise teams already on AWS/GCP/Azure can use the SDK without hitting Anthropic's API directly. Data stays in your cloud.

When to Use the Claude Agent SDK

Your tasks are code-centric (read, edit, run, test)
You want a batteries-included agent with no tool implementation overhead
You need fast iteration (one query() call and you're running)
You want subagent delegation without building your own orchestration layer
You're all-in on Claude (not model-agnostic)
You're building CI/CD integrations, code review bots, dev tooling

Part 3: Using Them Together

Here's the move nobody talks about: LangGraph for the workflow skeleton, Claude Agent SDK for the heavy lifting inside nodes.

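from typing import TypedDict

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END
from langgraph.types import interrupt
from claude_agent_sdk import query, ClaudeAgentOptions

checkpointer = MemorySaver()  # swap for a persistent saver in production

class ReviewState(TypedDict):
    pr_url: str
    security_findings: list
    test_results: str
    approved: bool

def human_review_node(state: ReviewState) -> dict:
    # pause for a human; the resume value is expected to be {"approved": bool}
    decision = interrupt({
        "security_findings": state["security_findings"],
        "test_results": state["test_results"],
    })
    return {"approved": decision["approved"]}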

async def security_audit_node(state: ReviewState) -> dict:
    findings = []
    async for msg in query(
        prompt=f"Security audit the changes in PR: {state['pr_url']}",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Grep", "Glob", "WebFetch"],
        )
    ):
        if hasattr(msg, "result"):
            findings.append(msg.result)
    return {"security_findings": findings}

async def test_runner_node(state: ReviewState) -> dict:
    output = ""
    async for msg in query(
        prompt="Run the test suite and report coverage",
        options=ClaudeAgentOptions(
            allowed_tools=["Bash", "Read"],
            permission_mode="bypassPermissions"
        )
    ):
        if hasattr(msg, "result"):
            output = msg.result
    return {"test_results": output}

# LangGraph orchestrates, Claude Agent SDK executes
builder = StateGraph(ReviewState)
builder.add_node("security_audit", security_audit_node)
builder.add_node("test_runner", test_runner_node)
builder.add_node("human_review", human_review_node)  # interrupt here

# run security + tests in parallel, then human review
builder.add_edge(START, "security_audit")
builder.add_edge(START, "test_runner")
builder.add_edge("security_audit", "human_review")
builder.add_edge("test_runner", "human_review")
builder.add_edge("human_review", END)

pipeline = builder.compile(checkpointer=checkpointer)

LangGraph gives you: parallelism, checkpointing, human approval gates, conditional routing. Claude Agent SDK gives you: autonomous code execution inside each node with zero boilerplate.

Lethal combination.

Framework Decision Matrix

Do you need multi-LLM support?
├── Yes → LangGraph (model-agnostic)
└── No → Either works

Are tasks primarily code/file operations?
├── Yes → Claude Agent SDK (batteries included)
└── No → LangGraph (build your own tools)

Need durable execution / checkpointing?
├── Yes → LangGraph (first-class checkpoint support)
└── No → Either works

Need human-in-the-loop pauses?
├── Yes → LangGraph (interrupt() built in)
└── No → Either works

Need to ship in < 1 day?
├── Yes → Claude Agent SDK (one query() call and you're running)
└── No → LangGraph (invest in the graph)

Building a production workflow with complex branching?
└── LangGraph + Claude Agent SDK nodes

Building a dev tool / CI integration?
└── Claude Agent SDK, probably with subagents

Quick Start Recipes

Claude Agent SDK — Code Review Bot in 20 lines

import { query } from "@anthropic-ai/claude-agent-sdk";
import { execSync } from "child_process";

const diff = execSync("git diff main HEAD").toString();

for await (const message of query({
  prompt: `Review this PR diff for bugs, security issues, and style problems:\n\n${diff}`,
  options: {
    allowedTools: ["Read", "Glob", "Grep"],
    systemPrompt: "You are a senior engineer. Be specific, be harsh, be helpful."
  }
})) {
  if ("result" in message) console.log(message.result);
}

LangGraph — Research Agent with Human Approval

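from typing import TypedDict

from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
from langgraph.types import interrupt

class State(TypedDict):
    findings: str
    approved: bool

def research_node(state):
    # run your LLM + web search tools here, then return the findings
    return {"findings": "..."}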

def approval_node(state):
    decision = interrupt({"findings": state["findings"]})
    return {"approved": decision}

def publish_node(state):
    if state["approved"]:
        # push to CMS, Slack, wherever
        ...

graph = StateGraph(State)
graph.add_node("research", research_node)
graph.add_node("approve", approval_node)
graph.add_node("publish", publish_node)
graph.add_edge(START, "research")
graph.add_edge("research", "approve")
graph.add_conditional_edges("approve",
    lambda s: "publish" if s["approved"] else END)
graph.add_edge("publish", END)

app = graph.compile(checkpointer=MemorySaver())

TL;DR

LangGraph = control plane for complex agentic workflows. Graph-based, durable, model-agnostic. Reach for it when you need to orchestrate across multiple agents, require human-in-the-loop, or need production-grade resilience.
Claude Agent SDK (formerly Claude Code SDK — just renamed) = execution engine for file/code/shell tasks. One function call. Rich built-in toolset. Subagents out of the box. Reach for it when you're building dev tooling or want a Claude-powered autonomous worker.
Use both when you want LangGraph's orchestration guarantees + Claude's execution power inside each node.

The multi-agent era demands frameworks, not just models. Learn the graph.
