
Core Concepts

NimbleBrain’s core value is composing multiple MCP servers into a single, orchestrated workspace. Rather than connecting to one MCP server at a time, NimbleBrain aggregates tools from every installed app into a unified namespace, layers prompts to give the agent awareness of all available capabilities, and filters tool access per-task through skills.

```
┌──────────────────────────────────────────────────┐
│                   ToolRegistry                   │
│                                                  │
│  ┌─────────┐  ┌─────────┐  ┌──────────────────┐  │
│  │  App 1  │  │  App 2  │  │  App 3 (remote)  │  │
│  │ (stdio) │  │ (stdio) │  │    (HTTP/SSE)    │  │
│  │ 3 tools │  │ 5 tools │  │     2 tools      │  │
│  └─────────┘  └─────────┘  └──────────────────┘  │
│                                                  │
│  Unified namespace: 10 app tools + system tools  │
└──────────────────────────────────────────────────┘
          │                          │
          ▼                          ▼
   Skill filtering             Agent Engine
   (allowed-tools)             (agentic loop)
```

Three mechanisms make this composable:

| Mechanism | What it does |
| --- | --- |
| Tool aggregation | The ToolRegistry collects tools from every MCP source (local stdio, remote HTTP/SSE) and presents them as a flat namespace. The agent doesn’t know or care which server owns a tool. |
| Skill-scoped filtering | Each skill declares allowed-tools glob patterns. When a skill matches, only tools matching those patterns (plus system tools) are visible to the agent. This scopes the agent’s capabilities per task. |
| 4-layer prompt composition | System prompts are assembled from: (1) identity, (2) core context, (3) installed app metadata with trust scores, and (4) the matched skill’s prompt. Every layer adds awareness without the agent needing to discover it. |
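The aggregation mechanism can be pictured as a small registry. The sketch below is illustrative, not NimbleBrain’s actual implementation; the `server__tool` qualified-name convention is inferred from patterns like websearch__* and nb__* used elsewhere in these docs:

```typescript
// Illustrative sketch of tool aggregation: tools from every source are
// flattened into one namespace using a "<server>__<tool>" naming scheme.

interface Tool {
  name: string;        // fully qualified, e.g. "websearch__search"
  description: string;
}

class ToolRegistry {
  private tools = new Map<string, Tool>();

  // Register every tool a source reports, prefixed with its server name.
  addSource(serverName: string, toolNames: string[]): void {
    for (const t of toolNames) {
      const qualified = `${serverName}__${t}`;
      this.tools.set(qualified, { name: qualified, description: "" });
    }
  }

  // The agent sees one flat list; it never learns which server owns a tool.
  list(): string[] {
    return [...this.tools.keys()];
  }
}

const registry = new ToolRegistry();
registry.addSource("websearch", ["search", "fetch"]);
registry.addSource("nb", ["discover_tools"]);
```

Because the namespace is flat, a skill’s glob patterns and the agent’s tool calls both operate on qualified names without any per-server routing logic.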

Every chat message triggers an agentic loop in the AgentEngine. The loop repeats until the model produces a response with no tool calls, or a limit is reached.

```
User message
         │
         ▼
┌─────────────────┐
│   Call Claude   │◄─────────────────┐
│  (with tools)   │                  │
└────────┬────────┘                  │
         │                           │
    ┌────┴────┐                      │
    │  Tool   │  Yes                 │
    │ calls?  │──────► Execute tools │
    │         │        in parallel ──┘
    └────┬────┘
         │ No
         ▼
  Return response
```

Each pass through the loop is one iteration. Defaults:

| Setting | Default | Hard cap |
| --- | --- | --- |
| Max iterations | 10 | 25 |
| Max input tokens | 500,000 | — |
| Max output tokens | 16,384 | — |
| Model | claude-sonnet-4-5-20250929 | — |

The engine stops for one of three reasons:

  • complete — the model responded without requesting any tool calls
  • max_iterations — the iteration limit was reached
  • token_budget — cumulative input tokens exceeded maxInputTokens. When this happens, tool calls from the current LLM response are dropped (not executed) to avoid running tools whose results can’t be processed.

All tool calls within a single iteration run in parallel via Promise.all().
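The loop and its three stop reasons can be sketched as follows. This is a simplified model under stated assumptions (model and tool plumbing are stubbed; the real AgentEngine tracks more state):

```typescript
// Hedged sketch of the agentic loop: call the model, run any requested tool
// calls in parallel, feed results back, and stop for one of the three
// documented reasons.

type StopReason = "complete" | "max_iterations" | "token_budget";

interface LlmResponse { text: string; toolCalls: string[]; inputTokens: number }

async function runLoop(
  callModel: (history: string[]) => Promise<LlmResponse>,
  executeTool: (call: string) => Promise<string>,
  maxIterations = 10,          // default per the table above
  maxInputTokens = 500_000,
): Promise<StopReason> {
  const history: string[] = [];
  let usedInputTokens = 0;

  for (let i = 0; i < maxIterations; i++) {
    const res = await callModel(history);
    usedInputTokens += res.inputTokens;

    if (res.toolCalls.length === 0) return "complete";

    // Over budget: drop the pending tool calls rather than execute tools
    // whose results could never be processed.
    if (usedInputTokens > maxInputTokens) return "token_budget";

    // All tool calls in one iteration run concurrently.
    const results = await Promise.all(res.toolCalls.map(executeTool));
    history.push(...results);
  }
  return "max_iterations";
}
```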

Before the engine loop starts, the Runtime orchestrates the composition described above:

  1. Resolve or create a conversation
  2. Match the user message to a skill (triggers, then keywords)
  3. Compose the system prompt — 4-layer: identity + core skills + installed apps + matched skill
  4. Filter the unified tool namespace based on the matched skill’s allowed-tools globs
  5. Run the AgentEngine loop against the composed tool set
  6. Persist the conversation (JSONL or in-memory)
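Step 3’s layering can be shown as a small helper. The layer names come from the list above; the joining with blank lines is an assumption about formatting, not the documented behavior:

```typescript
// Sketch of 4-layer system prompt composition: identity, core context,
// installed app metadata (with trust scores), and the matched skill's body.
// Layers are concatenated in order; a missing skill layer is simply omitted.

interface PromptLayers {
  identity: string;
  coreContext: string;
  appsMetadata: string;   // installed app names + trust scores
  skillPrompt?: string;   // body of the matched skill, if any
}

function composeSystemPrompt(l: PromptLayers): string {
  return [l.identity, l.coreContext, l.appsMetadata, l.skillPrompt]
    .filter((s): s is string => Boolean(s))
    .join("\n\n");
}
```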

A bundle is an MCP server package installed into NimbleBrain. Bundles follow the MCPB format — each has a manifest.json declaring how to start the server, what tools it provides, and optional UI metadata.

Configure bundles in nimblebrain.json:

```json
{
  "bundles": [
    { "name": "@nimblebraininc/ipinfo" },
    { "path": "../my-local-server" },
    { "url": "https://mcp.example.com/mcp", "serverName": "remote" }
  ]
}
```
| Source | How it works |
| --- | --- |
| name | Downloaded from the mpak registry. The agent can install these at runtime via nb__manage_bundle. |
| path | Local filesystem path to a bundle directory. Resolved relative to the config file. |
| url | Remote MCP server over Streamable HTTP or SSE transport. |

Each bundle process goes through these states:

```
starting ──► running ──► crashed ──► dead
                │                     ▲
                └──► stopped ─────────┘
```
| State | Meaning |
| --- | --- |
| starting | Process is spawning |
| running | Healthy and serving tools |
| crashed | Process exited unexpectedly (will attempt recovery) |
| dead | Repeated crashes, not restarting |
| stopped | Manually stopped via API or CLI |
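One plausible encoding of this lifecycle is an allowed-transitions table derived from the state descriptions above; the actual supervisor logic may permit different transitions (the stopped → starting restart path in particular is an assumption):

```typescript
// Allowed-transitions sketch for the bundle lifecycle. Derived from the
// state table, not from NimbleBrain's source.

type BundleState = "starting" | "running" | "crashed" | "dead" | "stopped";

const transitions: Record<BundleState, BundleState[]> = {
  starting: ["running", "crashed"],
  running: ["crashed", "stopped"],
  crashed: ["starting", "dead"],  // recovery attempt, or give up after repeats
  dead: [],                       // terminal: no automatic restart
  stopped: ["starting"],          // manual restart (assumed)
};

function canTransition(from: BundleState, to: BundleState): boolean {
  return transitions[from].includes(to);
}
```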

NimbleBrain ships with @nimblebraininc/bash installed by default. This gives the agent basic shell access. Disable it with "noDefaultBundles": true in your config.

A bundle entry in nimblebrain.json accepts additional per-bundle fields:

```json
{
  "name": "@nimblebraininc/postgres",
  "env": { "DATABASE_URL": "postgres://localhost/mydb" },
  "protected": true
}
```
| Field | Description |
| --- | --- |
| env | Environment variables passed to the bundle process |
| protected | Prevents uninstall via the nb__manage_bundle system tool |
| trustScore | MTF trust score (0–100) from the mpak registry |
| ui | UI metadata: name, icon, and optional primaryView resource URI |

Skills are Markdown files with YAML frontmatter. They control what the agent knows and what tools it can use for a given message.

research.md

```markdown
---
name: research
description: Deep research on a topic
version: "1.0"
type: skill
priority: 5
allowed-tools:
  - "websearch__*"
  - "nb__discover_tools"
metadata:
  triggers:
    - "research this"
    - "deep dive"
  keywords:
    - research
    - investigate
    - analyze
    - sources
    - findings
---
You are a research agent. Search the web, gather multiple sources,
and synthesize findings into a structured report.
```
| Type | Priority | Behavior |
| --- | --- | --- |
| context | ≤ 10 | Always included in the system prompt, regardless of what the user says |
| skill | Any | Matched against each user message. Only the best match is used. |

When a user sends a message, skills of type skill are matched in two phases:

Phase 1 — Triggers. Each skill’s triggers list is checked for a substring match against the message. First hit wins. Triggers are high-confidence explicit patterns.

Phase 2 — Keywords. If no trigger matched, each skill’s keywords are counted against the message. A skill needs at least 2 keyword hits to qualify. The skill with the most hits wins.

```
User: "research this topic and analyze the sources"

Phase 1: triggers
  "research this" ── match! ──► research skill selected
```
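The two phases can be sketched as one function. This is an approximation of the documented behavior; tie-breaking between skills with equal keyword counts, and case handling, are assumptions:

```typescript
// Two-phase skill matcher: substring triggers first (first hit wins),
// then keyword counting with a minimum of 2 hits (most hits wins).

interface Skill {
  name: string;
  triggers: string[];
  keywords: string[];
}

function matchSkill(message: string, skills: Skill[]): Skill | undefined {
  const msg = message.toLowerCase();

  // Phase 1: triggers are high-confidence explicit patterns.
  for (const skill of skills) {
    if (skill.triggers.some((t) => msg.includes(t.toLowerCase()))) return skill;
  }

  // Phase 2: count keyword hits; a winner needs at least 2.
  let best: Skill | undefined;
  let bestHits = 1;
  for (const skill of skills) {
    const hits = skill.keywords.filter((k) => msg.includes(k.toLowerCase())).length;
    if (hits > bestHits) { best = skill; bestHits = hits; }
  }
  return best;
}
```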

A matched skill:

  • Injects its markdown body into the system prompt
  • Filters available tools to those matching its allowed-tools globs
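The tool-filtering side can be sketched as glob-to-regex matching. The always-visible nb__ system tools follow the surfacing rules described later; the exact glob dialect NimbleBrain supports is an assumption (only * is handled here):

```typescript
// Sketch of allowed-tools filtering: globs like "websearch__*" become
// anchored regexes, and nb__* system tools stay visible regardless.

function globToRegex(glob: string): RegExp {
  // Escape regex metacharacters, then turn "*" into ".*".
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

function filterTools(allTools: string[], allowedGlobs: string[]): string[] {
  const patterns = allowedGlobs.map(globToRegex);
  return allTools.filter(
    (t) => t.startsWith("nb__") || patterns.some((p) => p.test(t)),
  );
}
```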

Skills are loaded from three locations (in order):

  1. Built-in — src/skills/builtin/ (shipped with the package)
  2. Global — ~/.nimblebrain/skills/
  3. Config — directories listed in the skillDirs config option

Tools come from two places: MCP bundles and built-in system tools.

Every NimbleBrain instance has these nb__* tools available:

| Tool | Description |
| --- | --- |
| nb__discover_tools | Search available tools by keyword across all installed bundles |
| nb__discover_bundles | Search the mpak registry for bundles to install |
| nb__manage_bundle | Install, uninstall, or configure a bundle |
| nb__bundle_status | Get version, health, and tool count for installed bundles |
| nb__delegate | Spawn a child agent for sub-tasks (multi-agent delegation) |
| nb__manage_skill | Create, update, or delete skill files at runtime |

NimbleBrain uses tiered tool surfacing to keep the LLM’s tool list manageable:

| Condition | What the LLM sees |
| --- | --- |
| Total tools ≤ 30 | All tools surfaced directly |
| Total tools > 30, no skill matched | Only nb__* system tools. Others available via nb__discover_tools. |
| Skill matched with allowed-tools | Tools matching the globs + system tools. Others via nb__discover_tools. |

The threshold is configurable via maxDirectTools (default: 30).
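The decision table above can be sketched as a single function. The precedence (skill match first, then the size check) and the minimal glob handling are assumptions:

```typescript
// Sketch of tiered tool surfacing. Glob handling covers "*" only, enough
// for patterns like "websearch__*".

function matchesGlob(name: string, glob: string): boolean {
  const escaped = glob
    .split("*")
    .map((s) => s.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`).test(name);
}

function surfacedTools(
  allTools: string[],
  skillGlobs: string[] | null,  // matched skill's allowed-tools, or null
  maxDirectTools = 30,          // default threshold, configurable
): string[] {
  const isSystem = (t: string) => t.startsWith("nb__");

  if (skillGlobs) {
    // Skill matched: tools matching the globs, plus system tools.
    return allTools.filter((t) => isSystem(t) || skillGlobs.some((g) => matchesGlob(t, g)));
  }
  if (allTools.length <= maxDirectTools) return allTools; // small set: show all
  // Large set, no skill: only system tools; the rest via nb__discover_tools.
  return allTools.filter(isSystem);
}
```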

The nb__delegate tool spawns a child AgentEngine run with a scoped system prompt and filtered tool set. Named agent profiles are configured in nimblebrain.json:

```json
{
  "agents": {
    "researcher": {
      "description": "Deep research agent",
      "systemPrompt": "You are a research agent...",
      "tools": ["websearch__*"],
      "maxIterations": 8,
      "model": "claude-sonnet-4-5-20250929"
    }
  }
}
```

The child agent’s iteration budget is capped at min(child.maxIterations, parent.remaining - 1). Multiple delegations in the same turn run concurrently.
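The budget formula, written out directly from the sentence above:

```typescript
// Child iteration budget: the child's configured maximum, capped so the
// parent always keeps at least one iteration for itself.
function childIterationBudget(childMax: number, parentRemaining: number): number {
  return Math.min(childMax, parentRemaining - 1);
}
```

For example, a researcher configured with maxIterations: 8 gets only 4 iterations if the parent has 5 remaining.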

Conversations persist across messages so the agent remembers context.

| Backend | Default for | Location |
| --- | --- | --- |
| JSONL | CLI, HTTP server | ~/.nimblebrain/conversations/ |
| In-memory | Programmatic/SDK use | — |

Each conversation is a single .jsonl file:

```json
{"id":"conv_abc123","createdAt":"2025-03-15T10:30:00Z"}
{"role":"user","content":"What files are in my home directory?","ts":"..."}
{"role":"assistant","content":"Let me check...","ts":"...","toolCalls":[...]}
```

Line 1 is metadata (ID, creation timestamp). Lines 2+ are StoredMessage objects with role, content, timestamp, and optional tool call records.
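A reader for this layout is straightforward. The field shapes below are a minimal sketch based on the sample above, not the full StoredMessage type:

```typescript
// Parse a conversation .jsonl file: line 1 is metadata, lines 2+ are messages.

interface ConversationMeta { id: string; createdAt: string }
interface StoredMessage { role: "user" | "assistant"; content: string; ts: string }

function parseConversation(jsonl: string): { meta: ConversationMeta; messages: StoredMessage[] } {
  const lines = jsonl.trim().split("\n");
  const meta = JSON.parse(lines[0]) as ConversationMeta;
  const messages = lines.slice(1).map((l) => JSON.parse(l) as StoredMessage);
  return { meta, messages };
}
```

Append-only JSONL means resuming a conversation is just reading the file and replaying lines 2+ into the engine’s message history.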

Use --resume <id> in the CLI to continue an existing conversation:

```sh
nb --resume conv_abc123
```

Through the API, pass conversationId in the chat request body to continue an existing conversation.

NimbleBrain is a full MCP Apps host, implementing the ext-apps specification. MCP Apps are interactive UI applications — built with HTML/JavaScript — that render directly inside the host as sandboxed iframes.

Unlike traditional web apps, MCP Apps:

  • Live inside the conversation — no tab-switching, context stays together
  • Call tools bidirectionally — apps invoke MCP tools and receive pushed results from the agent
  • Inherit the host’s theme — CSS variables are injected automatically
  • Run in a security sandbox — no access to the parent page, cookies, or other apps

Any MCP bundle that declares UI resources and placements becomes an MCP App. See MCP Apps for the full concept and MCP App Bridge for the implementation protocol.

NimbleBrain exposes a /mcp endpoint that turns the entire platform into a Streamable HTTP MCP server. External MCP clients — Claude Code, Open WebUI, or another NimbleBrain instance — can connect and access all installed tools through a single endpoint.

This means NimbleBrain is both an MCP client (connecting to installed bundles) and an MCP server (exposing composed tools to external hosts). See MCP Endpoint for configuration and usage.