Documentation
Design, generate, and run sovereign AI agents
Overview
Forge is an AI agent factory. You design an agent through a conversation with the Forge architect at myagentos.ai/create. The architect captures your context, wires the capabilities your agent needs, and emits a downloadable agent zip you run on your own infrastructure with your own keys.
Architect (web)
LLM-powered designer that runs the Phase 1.5 interview and emits a scout-config spec.
Generator (Python)
Reads the spec, builds the workspace, renders the runner template, zips the result.
Runtime (in your agent)
The vendored scout_runtime your agent ships with — REPL, gateway, capabilities.
BYOK end-to-end. You bring your own LLM keys, your own per-capability credentials, your own hardware. Forge writes the agent and gets out of the way. Every byte that runs your agent lives in the zip you downloaded — no phone-home, no telemetry, no vendor lock-in.
Getting Started
Build Your First Agent
Five steps from idea to a running agent on your machine.
- Go to myagentos.ai/create
- Enter your LLM API key (Anthropic, OpenAI, Gemini, or Grok)
- Describe what you want. The architect runs Phase 1.5 — a 5-wave context interview that captures who you are, what you want the agent to do, and how you want it to behave. This step is what makes the agent yours instead of generic.
- Confirm the design when the architect summarizes it back
- Click Build, download the zip
API Keys (BYOK)
Forge is bring-your-own-key end to end. There are two key surfaces to understand.
| Provider | Key format | Cost model |
|---|---|---|
| Anthropic | sk-ant-... | Pay per token (Claude family) |
| OpenAI | sk-... | Pay per token (GPT family) |
| Gemini | AI... | Free tier + paid |
| Grok (xAI) | xai-... | Pay per token |
| Custom (Ollama, llama.cpp) | Any / none | Free (local) |
The architect's key powers the design conversation at myagentos.ai. It only lives in your browser session — Forge never stores it server-side.
The agent's key powers your agent's reasoning at runtime. It goes in the .env file inside the downloaded zip.
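The key formats in the table are distinct enough to identify a provider from the key prefix alone. The sketch below shows one way that could work; the function name and the "custom" fallback are illustrative assumptions, not Forge's actual detection code.

```python
# Prefixes come from the key-format table above. The function name and the
# "custom" fallback are illustrative assumptions.
KEY_PREFIXES = [
    ("sk-ant-", "anthropic"),  # must be checked before the shorter "sk-"
    ("sk-", "openai"),
    ("xai-", "grok"),
    ("AI", "gemini"),
]


def detect_provider(api_key: str) -> str:
    """Return the provider whose key prefix matches, or "custom" if none do."""
    key = api_key.strip()
    for prefix, provider in KEY_PREFIXES:
        if key.startswith(prefix):
            return provider
    return "custom"


print(detect_provider("sk-ant-example"))  # anthropic
print(detect_provider("sk-example"))      # openai
```

Order matters here: an Anthropic key also starts with sk-, so the longer prefix has to be tested first.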
Phase 1.5 Context Interview
This is what separates Forge from generic agent builders. Before the architect generates anything, it interviews you in 5 waves to capture 27 context fields. The answers get embedded into your agent's workspace files so the agent boots with you already known — no generic "How can I help you?" greeting.
Five fields are required: user_name, user_role, user_communication_style, response_length_preference, and primary_objective. The other 22 are strongly encouraged but optional.
Wave 1 — Who you are (8 questions)
Name, role, communication style, decision style, pet peeves, expertise areas, expertise gaps, timezone. Becomes your USER.md.
Wave 2 — Domain context (6 questions)
Primary objective, current workflow, pain points, ideal intervention point, escalation rules, audit requirements.
Wave 3 — Operational constraints (6 questions)
Data sources, output destinations, integration constraints, data sensitivity, budget, availability. Becomes your AGENTS.md.
Wave 4 — Voice (4 questions)
Voice role models, voice anti-patterns, humor tolerance, default response length. Becomes your SOUL.md.
Wave 5 — Hard rules (3 questions)
What the agent must NEVER do without confirmation, what it IS authorized to do unprompted, what triggers immediate escalation. Becomes your STANDING_ORDERS.md.
After your answers, the architect emits a scout-config JSON block. The web layer parses it and posts it to the generator, which embeds every captured field into your workspace files.
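As a rough illustration of the required-field rule, the sketch below checks a spec for the five required fields. It assumes, hypothetically, that the scout-config block keeps its answers in a flat "context" object; the real layout may differ, and the function name is invented for this example.

```python
import json

# The five required field names come from the Phase 1.5 description above.
REQUIRED_FIELDS = (
    "user_name",
    "user_role",
    "user_communication_style",
    "response_length_preference",
    "primary_objective",
)


def missing_required_fields(spec_json: str) -> list[str]:
    """List required context fields that are absent or blank.

    Assumes a flat "context" object; the real scout-config layout may differ.
    """
    context = json.loads(spec_json).get("context", {})
    return [f for f in REQUIRED_FIELDS if not str(context.get(f) or "").strip()]


example = json.dumps({"context": {"user_name": "Dan", "user_role": "Founder"}})
print(missing_required_fields(example))
```

An empty list means all five required fields are present; the 22 optional fields are not checked.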
Your First Download
The download is a self-contained Python package. Unzip it, create a venv, install, fill in .env, and run.
macOS / Linux
unzip agent.zip
cd agent
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
cp .env.example .env
# edit .env to add ANTHROPIC_API_KEY (and any per-capability keys
# the architect wired — Slack, GitHub, Notion, etc.)
# REPL mode (default)
python -m <agent-name>
# Gateway mode (for scout-tui or other external clients)
python -m <agent-name> --gateway
Windows (PowerShell)
Expand-Archive agent.zip -DestinationPath .
cd agent
py -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -e .
Copy-Item .env.example .env
# edit .env to add ANTHROPIC_API_KEY (and any per-capability keys
# the architect wired — Slack, GitHub, Notion, etc.)
notepad .env
# REPL mode (default)
py -m <agent-name>
# Gateway mode (for scout-tui or other external clients)
py -m <agent-name> --gateway
macOS: brew install python or download from python.org. Linux: usually pre-installed; otherwise sudo apt install python3 python3-venv (Ubuntu/Debian) or sudo dnf install python3 (Rocky/Fedora). Windows: install from python.org with "Add to PATH" checked, or winget install Python.Python.3. If PowerShell blocks Activate.ps1, run Set-ExecutionPolicy -Scope CurrentUser RemoteSigned once.
See Gateway server and scout-tui client for the multi-client setup.
Platform Support
Forge agents run on Linux, macOS, and Windows. Every generated agent ships with cross-platform code — the shell capability detects Windows and uses cmd.exe, the python capability uses whichever interpreter you're already running, and all file paths use pathlib.
| Platform | Status | Notes |
|---|---|---|
| Linux | ✓ Full | All 28 capabilities work. Tested target. |
| macOS | ✓ Full | All 28 work. TTS uses native say. |
| Windows 10/11 | ✓ Full | All 28 work. Use PowerShell commands above. |
| WSL on Windows | ✓ Full | If you prefer Unix tooling on a Windows box. |
The myagentos.ai website (architect chat, design, download) works in any modern browser — Chrome, Edge, Firefox, Safari — on any OS. Building the agent is OS-independent; only the "run the agent" step touches your machine, and that step has first-class support on all three.
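The cross-platform behavior described above (cmd.exe on Windows, the current interpreter for Python, pathlib for paths) can be sketched with the standard library. The function names are hypothetical, and the use of /bin/sh on non-Windows systems is an assumption; the docs only state the Windows case.

```python
import sys
from pathlib import Path


def shell_command(command: str) -> list[str]:
    """Wrap a command for the platform shell: cmd.exe on Windows, sh elsewhere."""
    if sys.platform == "win32":
        return ["cmd.exe", "/c", command]
    return ["/bin/sh", "-c", command]  # assumption: POSIX sh on Linux and macOS


def python_command(script: Path) -> list[str]:
    """Run a script with the interpreter that is already running this process."""
    return [sys.executable, str(script)]


# pathlib joins path segments with the right separator on every OS.
workspace_file = Path("workspace") / "USER.md"
print(shell_command("echo hello"))
print(python_command(Path("task.py")))
print(workspace_file)
```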
.venv is optional when you move an agent: you can rebuild it with one command on the new machine. Workspace state (memory, drafts, logs) is plain markdown + sqlite, fully portable across OSes.
The 28 Capabilities
Capabilities are the menu the architect proposes during design. Each one is a discrete piece of functionality — memory, scheduling, email, slack, web search, code execution, etc. — that gets wired into your agent if you opt in.
Core (5)
The primitives every agent should have — memory, scheduling, proactive behavior, procedural skills, user-set rules.
persistent_memory: Persistent memory (A-MEM)
Durable memory across sessions. Stored locally as embeddings + graph links via fastembed (ONNX, no PyTorch). Works air-gapped after first model fetch.
Use: Remembers that you prefer concise replies and that 'the Atlas project' is the migration plan you described last week.
scheduled_commitments: Scheduled follow-ups
The agent schedules its own follow-ups. When it decides 'I should check back in 3 days,' a commitment is queued and surfaced on the heartbeat. No external scheduler.
Use: On Monday you mention you're waiting on a vendor reply. On Thursday the agent surfaces the commitment and asks if you've heard back.
heartbeat_loop: Heartbeat / proactive check-ins
Drives proactive behavior on a configurable interval. Without it the agent only acts when you speak first.
Use: Every hour the agent reviews commitments. Every morning it offers a one-line summary of overnight changes.
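To make the relationship between scheduled commitments and the heartbeat concrete, here is a minimal in-memory sketch. The class and method names are invented for illustration; the real runtime presumably persists commitments and may structure them differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Commitment:
    note: str
    due: datetime


class CommitmentQueue:
    """In-memory queue; a real agent would persist this across sessions."""

    def __init__(self) -> None:
        self._pending: list[Commitment] = []

    def schedule(self, note: str, now: datetime, delay: timedelta) -> None:
        self._pending.append(Commitment(note, now + delay))

    def heartbeat(self, now: datetime) -> list[str]:
        """Return notes that are due and drop them from the queue."""
        due = [c for c in self._pending if c.due <= now]
        self._pending = [c for c in self._pending if c.due > now]
        return [c.note for c in due]


queue = CommitmentQueue()
monday = datetime(2025, 1, 6, 9, 0)
queue.schedule("Ask about the vendor reply", monday, timedelta(days=3))
print(queue.heartbeat(monday + timedelta(days=1)))  # nothing due yet
print(queue.heartbeat(monday + timedelta(days=3)))  # the commitment surfaces
```

Without a heartbeat nothing ever calls heartbeat(), which is why the two capabilities are usually wired together.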
skills: Skills system (procedural memory)
The agent loads SKILL.md playbooks for recurring task types — trigger conditions, numbered steps, pitfalls. Required for any agent doing repeatable workflows.
Use: You say 'deploy the API.' Agent matches the 'deploy-api' skill, runs its steps, reports each outcome.
standing_orders: Standing orders
Mutable user-set runtime rules with top priority in system prompt assembly. Different from SOUL.md (voice) — these are hard constraints you add after the agent ships.
Use: "Always check Calendar before suggesting meeting times." "Never send emails after 8pm without confirmation."
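One plausible reading of "top priority in system prompt assembly" is that standing orders are placed first when the prompt is built. The sketch below assumes that; only the position of the standing orders is taken from the docs, while the order of the other files and the function name are guesses.

```python
# Assumed order: only "standing orders come first" is stated in the docs.
# The relative order of the remaining files is a guess.
PROMPT_ORDER = ("STANDING_ORDERS.md", "SOUL.md", "AGENTS.md", "USER.md")


def assemble_system_prompt(workspace: dict[str, str]) -> str:
    """Join workspace files into one prompt, skipping files that are absent."""
    sections = [workspace[name].strip() for name in PROMPT_ORDER if name in workspace]
    return "\n\n".join(sections)


prompt = assemble_system_prompt({
    "USER.md": "# USER.md\n- Name: Dan",
    "STANDING_ORDERS.md": "# STANDING_ORDERS.md\n- Never send email after 8pm without confirmation.",
})
print(prompt)
```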
Execution (5)
Letting the agent actually do things — spawn sub-agents, run code, orchestrate workflows, isolate risky operations.
subagent_spawning: Sub-agent orchestration
Spawn child agents for parallel or isolated work. Useful for reasoning-heavy subtasks that would flood the parent's context, or for running multiple workstreams concurrently.
Use: You ask for a research brief on 3 competitors. Agent spawns 3 sub-agents in parallel, parent synthesizes.
python_execution: Python code execution
Write and run Python in a subprocess with the agent's installed packages available (pandas, numpy, requests, etc. if wired). 30s default timeout, 200KB output cap. Working directory isolated from the host.
Use: You paste a CSV and ask 'what's the trend in column 3?' Agent writes pandas code, runs it, surfaces the answer.
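The timeout and output cap described above can be approximated with the standard library. This is a sketch under assumptions, not the capability's actual code: the function name is hypothetical, and a temporary working directory alone is not a security sandbox.

```python
import subprocess
import sys
import tempfile

TIMEOUT_SECONDS = 30           # default timeout quoted above
MAX_OUTPUT_BYTES = 200 * 1024  # 200KB output cap quoted above


def run_python(code: str, timeout: float = TIMEOUT_SECONDS) -> str:
    """Run code in a child interpreter inside a throwaway working directory."""
    with tempfile.TemporaryDirectory() as workdir:
        try:
            result = subprocess.run(
                [sys.executable, "-c", code],
                cwd=workdir,
                capture_output=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return f"[timed out after {timeout}s]"
    output = result.stdout + result.stderr
    # Truncate before decoding so the cap is measured in bytes.
    return output[:MAX_OUTPUT_BYTES].decode("utf-8", errors="replace")


print(run_python("print(sum(range(10)))"))
```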
shell_execution: Shell command execution
Run shell commands in a sandboxed subprocess. Marked as a dangerous tool requiring explicit per-call approval when require_approval is set.
Use: You say 'show me git log for this repo.' Agent runs git log and surfaces the output.
flows: Multi-step workflows
Named flows that chain tool calls with explicit state, retries, and rollback. Checkpoint-persisted so they survive crashes.
Use: 'Incident-triage' flow: ack alert → fetch logs → look up runbook → page on-call → create post-mortem doc. Resumes from last checkpoint if interrupted.
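Checkpointed resumption with retries can be illustrated in a few lines. The sketch below is an assumption about how such a flow runner might look; the function name and the JSON checkpoint format are invented, and rollback is left out for brevity.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def run_flow(steps: list[tuple[str, Callable[[], None]]],
             checkpoint: Path, max_retries: int = 2) -> list[str]:
    """Run named steps in order, skipping any the checkpoint marks as done."""
    done = json.loads(checkpoint.read_text()) if checkpoint.exists() else []
    for name, action in steps:
        if name in done:
            continue  # finished in an earlier run
        for attempt in range(max_retries + 1):
            try:
                action()
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up; the checkpoint keeps earlier progress
        done.append(name)
        checkpoint.write_text(json.dumps(done))  # persist after every step
    return done


calls: list[str] = []
steps = [
    ("ack_alert", lambda: calls.append("ack_alert")),
    ("fetch_logs", lambda: calls.append("fetch_logs")),
]
checkpoint_file = Path(tempfile.mkdtemp()) / "incident-triage.json"
checkpoint_file.write_text(json.dumps(["ack_alert"]))  # simulate a crash after step 1
print(run_flow(steps, checkpoint_file))  # only fetch_logs runs this time
print(calls)
```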
sandbox: Sandboxed code execution
Isolated execution environment for python_execution and shell_execution. Limits filesystem access, network, CPU time, and memory. Per-capability config — sandbox python tightly while leaving shell more permissive, or vice versa.
Use: Agent runs user-submitted Python from chat. Sandbox limits to 30s, no network, read-only /tmp, 256MB RAM.
Communication (6)
Channels the agent can read from and write to — email, chat, SMS, webhooks, the gateway, voice.
email_imap: Email (IMAP read + SMTP send)
Read and send via standard IMAP/SMTP. Works with Gmail (app password), Fastmail, ProtonMail Bridge, self-hosted. Agent never sees the raw password.
Use: Agent watches a designated mailbox, forwards order confirmations to accounting, replies to common questions, escalates the rest.
slack_integration: Slack (post + read + DM)
Post messages, read channel history, send DMs. Uses Slack's bot API with a workspace token. The agent becomes a first-class Slack participant.
Use: Agent posts deploy notifications, monitors #help for FAQ-able questions, DMs on-call when alerts spike.
sms_messaging: SMS (Twilio)
Send and receive SMS via Twilio. Good for low-bandwidth proactive notifications and users who prefer text over chat apps.
Use: Agent texts you when a long-running task finishes or when a calendar event is starting.
webhook_receiver: HTTP webhook receiver
Exposes an HTTP endpoint for incoming webhooks. FastAPI + uvicorn. Useful for integrations with services that POST events.
Use: Agent receives GitHub PR webhooks and reviews each one. Or Stripe payment events and updates the accounting log.
gateway: HTTP + WebSocket gateway
Network surface for multi-client access. Exposes the agent via HTTP REST + WebSocket so external clients (scout-tui, browser dashboards, IDE plugins) can connect. Default bind 127.0.0.1:7891. Required if you want to reach the agent from anywhere other than the local REPL.
Use: You want the polished scout-tui experience. Agent starts the gateway on localhost:7891; scout-tui connects. Multiple clients can connect simultaneously.
tts: Text-to-speech (voice output)
Voice output via OpenAI, ElevenLabs, MiniMax, Edge, or local `say` on macOS. Auto-routes based on configured backend. Audio streams to speakers, saves as files, or returns via the gateway.
Use: Heartbeat detects a critical alert. Agent speaks 'alert: production database CPU at 95% for 3 minutes' through speakers, then posts the same text to Slack.
Data (4)
Reading and writing data — local files, the live web, structured storage.
file_ops: Local file read/write
Read and write within a configured directory tree. Strict path-escape rejection — agent can't reach outside its sandbox. Configurable sandbox_root + max_file_size_kb.
Use: Agent maintains a project journal in ~/projects/notes/, summarizes meeting transcripts into structured notes, edits config files on request.
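Path-escape rejection is typically done by resolving the requested path and checking that it still sits under the sandbox root. The sketch below shows that technique with pathlib; the function name is hypothetical and the capability's real checks may be stricter.

```python
from pathlib import Path


def resolve_in_sandbox(sandbox_root: Path, requested: str) -> Path:
    """Resolve a requested path and refuse anything outside sandbox_root."""
    root = sandbox_root.resolve()
    target = (root / requested).resolve()  # collapses ".." and follows symlinks
    if not target.is_relative_to(root):    # Path.is_relative_to needs Python 3.9+
        raise PermissionError(f"path escapes sandbox: {requested}")
    return target


root = Path("notes")
print(resolve_in_sandbox(root, "meetings/monday.md"))
try:
    resolve_in_sandbox(root, "../../etc/passwd")
except PermissionError as err:
    print(err)
```

Absolute paths are rejected too, because joining an absolute path onto the root discards the root before the check runs.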
web_search: Web search
Query the live web. Pluggable provider (Tavily, Brave, Exa, SerpAPI). The agent uses this when it needs current information beyond its training data.
Use: You ask "what was the latest funding round for Acme Corp?" Agent searches, reads top results, summarizes.
web_extract: Web page content extraction
Fetch a URL and extract its main content as clean markdown. httpx + markdownify. Falls back to raw HTML when the extractor can't find a main region.
Use: You share a URL. Agent fetches and summarizes, or extracts a specific data point.
sqlite_database: Local SQLite database
Maintain structured records in local SQLite. Schema and queries are agent-driven — you describe what to track and the agent picks the schema. Works air-gapped.
Use: Agent maintains an expense tracker: each receipt you mention gets logged with date, vendor, amount, category.
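For the expense-tracker example, an agent-chosen schema might look something like the following. The table and column names are illustrative assumptions; in practice the agent picks the schema from your description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real agent would use a file on disk
conn.execute(
    """CREATE TABLE expenses (
           spent_on TEXT NOT NULL,   -- ISO date
           vendor   TEXT NOT NULL,
           amount   REAL NOT NULL,
           category TEXT NOT NULL
       )"""
)


def log_expense(spent_on: str, vendor: str, amount: float, category: str) -> None:
    """Insert one receipt using bound parameters rather than string formatting."""
    conn.execute(
        "INSERT INTO expenses VALUES (?, ?, ?, ?)",
        (spent_on, vendor, amount, category),
    )
    conn.commit()


log_expense("2025-01-06", "Acme Hosting", 12.50, "infra")
log_expense("2025-01-07", "Corner Cafe", 4.25, "meals")
totals = conn.execute(
    "SELECT category, SUM(amount) FROM expenses GROUP BY category ORDER BY category"
).fetchall()
print(totals)
```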
Integration (6)
Plugging the agent into the SaaS tools you already use.
github_integration: GitHub (issues + PRs + repos)
Read and write GitHub via REST + a PAT or GitHub App. Create/update issues, comment on PRs, read repo contents, run actions.
Use: Agent watches a repo's issues, auto-labels them by topic, drafts PR review comments, syncs your TODO list with the tracker.
calendar_integration: Calendar (Google + iCal)
Read events from Google Calendar or an iCal feed. Optionally create events. Useful for scheduling agents and agents that need to be aware of your day.
Use: Agent knows your meeting schedule and proactively asks if you want a prep brief 15 minutes before each meeting.
notion_integration: Notion (read + write pages)
Read and write Notion pages, databases, and properties. Good for agents that maintain knowledge bases or project trackers in Notion.
Use: Agent maintains a CRM-lite Notion database, adding rows when you mention new contacts and updating fields as it learns.
linear_integration: Linear (issue tracking)
Read and write Linear issues, comments, and projects via GraphQL. For engineering-focused agents.
Use: Agent triages new bugs into the right team's queue, drafts initial repro notes, links related issues.
acp: ACP (Agent Coordination Protocol)
Lets Claude Code, OpenAI Codex CLI, and other ACP-compatible tools drive your agent as a sub-agent. stdio + JSON-RPC — no HTTP overhead. Useful when you want the agent reachable inside your IDE workflow.
Use: You're in Claude Code and want your agent to handle code review on a side branch. Claude Code connects via ACP, hands off, surfaces findings in the same session.
plugins: Plugin system
Extensible plugin loader with cryptographic signing. External packages add custom tools, hooks, or workflows. Signed plugins verify against trusted keys; unsigned ones require explicit consent.
Use: You install a community-built 'jira-integration' plugin. It signs against a known key, gets auto-trusted, registers new tools.
Observability (2)
Seeing what your agent did, what it cost, and why.
conversation_logging: Conversation logging
Every turn logged to disk as JSON lines. Useful for debugging, training-data collection, and audit trails.
Use: When the agent's behavior surprises you, you review the exact prompt + response in <state_dir>/conversations/.
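JSON lines logging means one JSON object per line, appended as each turn happens. The sketch below shows the general shape; the record fields and file name are assumptions, not the runtime's actual log schema.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_turn(log_file: Path, role: str, content: str) -> None:
    """Append one turn as a single JSON object on its own line."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "role": role,
        "content": content,
    }
    log_file.parent.mkdir(parents=True, exist_ok=True)
    with log_file.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_turns(log_file: Path) -> list[dict]:
    """Parse the log back into a list of records, one per line."""
    with log_file.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


log_path = Path(tempfile.mkdtemp()) / "conversations" / "session.jsonl"
log_turn(log_path, "user", "what's on my plate today")
log_turn(log_path, "assistant", "3 threads waiting >24h.")
print([turn["role"] for turn in read_turns(log_path)])
```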
metric_tracking: Usage metrics (tokens + cost)
Tracks LLM token usage and estimated cost per session. Useful for BYOK spend monitoring and spotting runaway loops early.
Use: You ask "how much did our chat today cost?" Agent reports tokens by model and a dollar estimate.
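A cost estimate is just token counts multiplied by a per-model price table. The sketch below uses placeholder model names and made-up prices to show the arithmetic; none of the numbers are real provider rates, and the class name is hypothetical.

```python
from collections import defaultdict

# Placeholder prices in USD per million tokens (input, output). These are
# made-up numbers for illustration, not any provider's real rates.
PRICES = {"model-a": (3.00, 15.00), "model-b": (1.00, 5.00)}


class UsageTracker:
    def __init__(self) -> None:
        self.tokens = defaultdict(lambda: [0, 0])  # model -> [input, output]

    def record(self, model: str, input_tokens: int, output_tokens: int) -> None:
        self.tokens[model][0] += input_tokens
        self.tokens[model][1] += output_tokens

    def estimated_cost(self) -> float:
        """Sum the per-model cost from the token counts and the price table."""
        total = 0.0
        for model, (inp, out) in self.tokens.items():
            in_price, out_price = PRICES[model]
            total += inp / 1_000_000 * in_price + out / 1_000_000 * out_price
        return round(total, 4)


tracker = UsageTracker()
tracker.record("model-a", 10_000, 2_000)
tracker.record("model-b", 50_000, 10_000)
print(dict(tracker.tokens), tracker.estimated_cost())
```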
Workspace Files
Every generated agent ships with a workspace/ directory holding five markdown files. They drive how the agent thinks, sounds, and behaves. You can read and edit any of them directly — they're yours.
USER.md
Stores: factual user profile (the Wave 1 + Wave 2 answers). Edited by: the agent updates it as it learns; you can edit directly anytime. Read: every turn, injected into the system prompt.
# USER.md
- Name: Dan
- Role: Runs the CIQ Startup Program. Serial founder; values speed and signal over polish.
- Communication style: terse
- Decision style: intuition-led, justifies after
- Pet peeves: "I'd be happy to help!", fake humility, em-dash overuse
- Expertise areas: GTM, startup ops, AI infra
- Expertise gaps: low-level Rust, GPU kernels
- Timezone: America/New_York
# Domain
- Primary objective: end-of-day inbox triage with draft replies
- Current workflow: scroll Gmail at 5pm, miss things, draft poorly under fatigue
- Pain points: response latency on key threads, dropped commitments
- Ideal intervention point: 5pm summary + drafts ready for review
SOUL.md
Stores: the agent's voice, tone, and values (the Wave 4 answers). Edited by: set at design time; you edit if voice drifts. Read: every turn.
# SOUL.md
## Voice
Terse. Lead with the answer. No filler openings ("Great question!"),
no apologies for things that aren't apologies' fault.
## Role models
- Patrick Collison on Twitter — short, exact, no posturing
- A senior engineer who's seen this before and isn't impressed
## Anti-patterns
- "I'd be happy to help!"
- Excessive disclaimers
- LinkedIn-speak
- Em-dash overuse
## Humor
Dry. Occasional. Never forced.
## Default response length
1-3 sentences for casual questions. Long form only when explicitly asked.
IDENTITY.md
Stores: agent name, archetype, catchphrases, emoji. Edited by: set at generation; rarely changes. Read: boot only.
# IDENTITY.md
- Name: Flint
- Archetype: laconic operator
- Catchphrase: "Acknowledged."
- Emoji: 🪨
- One-line self-description: end-of-day inbox triage, drafts ready by 5pm
AGENTS.md
Stores: operational manual — how the agent behaves, which channels it reaches you on, hard ops rules (the Wave 3 answers). Edited by: set at generation; edit when ops change. Read: every turn.
# AGENTS.md
## Data sources
- Gmail (IMAP)
- Calendar (Google)
- Linear (read-only)
## Output destinations
- Draft emails to Gmail drafts folder
- 5pm summary to Slack DM
- Critical escalations via SMS
## Availability
Always-on background. Heartbeat every 30 minutes during business hours.
## Data sensitivity
Customer email contents — never log to conversation_logging, never include in summaries to third parties.
## Token budget
$5/day cap. Stop and alert if exceeded.
STANDING_ORDERS.md
Stores: hard rules, auto-actions, escalation triggers (the Wave 5 answers). Only emitted if you captured these in Phase 1.5. Edited by: you, anytime. Top priority in system prompt. Read: every turn.
# STANDING_ORDERS.md
## Never without confirmation
- Send any outbound message (email, SMS, Slack post)
- Modify a calendar event
- Spend more than $1 of LLM tokens on a single task
## Authorized without asking
- Read inbox, calendar, Linear
- Draft emails (save to drafts only)
- Log activity to local SQLite
## Escalate immediately via SMS
- Any error in send path
- Token spend within 20% of daily cap
- Heartbeat missed for >2 hours
Running Your Agent
One package, four ways to interact: REPL, gateway server, scout-tui client, or ACP integration into another CLI.
REPL (default)
The default mode. python -m <agent> opens an interactive prompt. Type a message, agent replies in its configured voice. Ctrl-D to exit. Type /help to see slash commands.
$ python -m flint
flint v0.1 — laconic operator. Acknowledged.
Workspace loaded: USER.md, SOUL.md, IDENTITY.md, AGENTS.md, STANDING_ORDERS.md
Capabilities: 28 wired
Provider: anthropic / claude-opus-4-8
> what's on my plate today
3 threads waiting >24h. 1 calendar conflict at 14:00. Drafts ready in Gmail.
> draft a reply to the vendor email
Drafted. Saved to Gmail drafts. 4 lines, declines politely, asks for revised quote by Friday.
>
Model selection
Three ways to pick which model your agent runs. They compose — design-time choice flows into runtime, and runtime overrides whatever was baked in.
1. At design time (myagentos.ai)
When you enter your API key, the modal shows a model picker for the detected provider. Pick once, save with Remember, and that choice powers Phase 1.5 and is written into your generated agent's .env.example as a pre-filled LLM_MODEL= line.
Defaults: claude-opus-4-8 for Anthropic, gpt-5 for OpenAI, gemini-2-5-pro for Gemini. The picker auto-detects your provider from the key prefix.
2. Before boot (.env)
Set LLM_MODEL=<model-id> in your agent's .env file. Generated agents ship with this line pre-filled if you picked a model at design time; you can edit it any time. The runner reads it before instantiating the provider.
.env
# LLM Provider
ANTHROPIC_API_KEY=sk-ant-...
LLM_MODEL=claude-opus-4-8
GATEWAY_AUTH_TOKEN=flint-dev-token-123
...
3. At runtime (REPL slash command)
Swap models mid-conversation. /model shows the current model + everything your provider supports. /model <name> switches. The next turn uses the new model; conversation history is preserved.
> /model
Provider: anthropic
Current model: claude-opus-4-8
Available:
* claude-opus-4-8
claude-opus-4-7
claude-sonnet-4-5
claude-sonnet-4-20250514
claude-haiku-4-5
Switch with: /model <name>
> /model claude-haiku-4-5
Model switched to: claude-haiku-4-5
> quick — what's blocking the launch?
Three things: vendor contract (signed yesterday), legal review (waiting on
Sarah), and staging deploy (in CI, 12 minutes left).
Useful patterns: drop to Haiku for cheap sub-agent dispatch, step up to Opus for complex synthesis turns, A/B test providers without restarting your agent.
Gateway server
For multi-client access — scout-tui, browser dashboards, IDE plugins. --gateway starts a uvicorn server on 127.0.0.1:7891 by default with HTTP routes for /v1/health, /v1/hello, /v1/rpc and a WebSocket at /v1/ws for chat.
$ python -m flint --gateway
[flint] loading workspace…
[flint] wiring capabilities (28)…
[flint] gateway listening on http://127.0.0.1:7891
[flint] WARNING: no GATEWAY_AUTH_TOKEN set; gateway accepts unauthenticated connections.
[flint] press Ctrl-C to stop
Set GATEWAY_AUTH_TOKEN in .env to require token auth. Override the bind with GATEWAY_HOST and GATEWAY_PORT. Set GATEWAY_HOST=0.0.0.0 only behind a real auth token — never expose the gateway publicly without one.
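The bind rule above can be enforced as a pre-flight check before starting the server. The sketch below is stricter than the shipped gateway, which, per the log above, only warns when no token is set; the function name and the hard failure are assumptions for illustration.

```python
def check_gateway_config(env: dict[str, str]) -> tuple[str, int]:
    """Return (host, port), refusing a non-loopback bind without a token."""
    host = env.get("GATEWAY_HOST", "127.0.0.1")
    port = int(env.get("GATEWAY_PORT", "7891"))
    token = env.get("GATEWAY_AUTH_TOKEN", "")
    loopback = host in ("127.0.0.1", "localhost", "::1")
    if not loopback and not token:
        # Assumption: fail hard here; the real gateway only logs a warning.
        raise ValueError(f"refusing to bind {host}:{port} without GATEWAY_AUTH_TOKEN")
    return host, port


print(check_gateway_config({}))
print(check_gateway_config({"GATEWAY_HOST": "0.0.0.0", "GATEWAY_AUTH_TOKEN": "long-random-value"}))
```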
scout-tui client
A separate Ink/React terminal client that connects to a running agent over the gateway WebSocket. Polished banner, connection status, command history.
# In a terminal where the agent gateway is reachable
export SCOUT_GATEWAY_URL=ws://127.0.0.1:7891/v1/ws
export SCOUT_GATEWAY_TOKEN=<same token as agent's GATEWAY_AUTH_TOKEN>
# Run the TUI
node dist/entry.js
# (or: npm install -g scout-tui, then: scout-tui)
Multiple TUI clients can connect to the same gateway — useful for screen-sharing the agent during a call.
ACP integration
Agent Coordination Protocol lets other CLIs — Claude Code, Codex, OpenCode — drive your agent as a sub-agent via stdio + JSON-RPC. Run the agent in ACP mode and configure the calling CLI to spawn it.
# Run the agent as an ACP server over stdio
python -m flint --acp --stdio
# In Claude Code (~/.claude.json), register flint as an ACP agent
# (advanced — see the ACP capability docs in your agent's workspace/)
Deployment
Five ways to run your agent. All start from the same downloaded zip.
Local Python
Python 3.10+. Simplest option.
Container
Rocky Linux 9. ~180MB. Podman or Docker.
Sovereign
Bundled LLM. No internet needed.
Local Python
Fastest path to running your agent. Requires Python 3.10+.
unzip agent.zip && cd agent
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
cp .env.example .env
# Add your API keys
# REPL
python -m <agent-name>
# Gateway (for scout-tui)
python -m <agent-name> --gateway
Rocky Linux Container
Requires Podman (recommended) or Docker. The container ships with Python 3.11 and all wired-capability deps pre-installed.
unzip agent.zip && cd agent
cp .env.example .env
# Add your API keys
# Build the Rocky Linux container (~180MB)
./container/build.sh
# Run with the REPL attached
./container/run.sh --env ./.env
# Run as a gateway daemon
./container/run.sh --env ./.env --gateway --detach
Fully Sovereign
Bundles Ollama and a local LLM into the container image. Once built, no internet is needed. No API keys for the LLM. No data leaves your machine.
unzip agent.zip && cd agent
# Build with bundled model (~2-6GB image)
./container/build.sh --sovereign --model llama3.2:3b
# Run — fully offline from this point
./container/run.sh --env ./.env
| Model | Size | RAM | Speed (CPU) | Best for |
|---|---|---|---|---|
| llama3.2:3b | ~2GB | 4GB+ | ~10 tok/s | Most agents, fast, lightweight |
| phi3:mini | ~2.3GB | 4GB+ | ~8 tok/s | Strong reasoning |
| mistral:7b | ~4GB | 8GB+ | ~5 tok/s | Best quality on CPU |
| llama3.1:8b | ~4.7GB | 8GB+ | ~4 tok/s | Newest Llama |
| gemma2:9b | ~5.4GB | 12GB+ | ~3 tok/s | Google's best small model |
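The RAM column in the table can be turned into a quick filter for picking a model that fits a given machine. This is an illustrative helper, not part of the build script; the function name is hypothetical and the thresholds are the table's lower bounds.

```python
# (model tag, minimum RAM in GB), copied from the table above and kept in
# the table's order.
MODELS = [
    ("llama3.2:3b", 4),
    ("phi3:mini", 4),
    ("mistral:7b", 8),
    ("llama3.1:8b", 8),
    ("gemma2:9b", 12),
]


def models_that_fit(ram_gb: float) -> list[str]:
    """Return every model whose minimum RAM is at or below the available RAM."""
    return [tag for tag, min_ram in MODELS if min_ram <= ram_gb]


print(models_that_fit(8))
```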
Cloud Deployment
Deploy to any cloud provider. You don't need your own hardware.
CPU VPS ($5-10/month)
Best for most agents. A 3B sovereign model runs fine on CPU. Works on Hetzner, DigitalOcean, Vultr, Linode.
# Build locally
./container/build.sh --sovereign --model llama3.2:3b
# Save and copy to VPS
podman save my-agent:sovereign -o my-agent.tar
scp my-agent.tar .env user@your-vps:~/
# On the VPS
ssh user@your-vps
podman load -i my-agent.tar
podman run -d --name my-agent --restart=always \
-v ~/.env:/app/.env:ro \
-v ~/data:/app/data \
-p 127.0.0.1:7891:7891 \
my-agent:sovereign --gateway
GPU Cloud
For 7B+ models or low-latency needs. Lambda Labs ($0.80/hr), RunPod ($0.39/hr), Vast.ai ($0.15/hr).
podman push my-agent:sovereign ghcr.io/yourname/my-agent:sovereign
# On GPU instance
podman pull ghcr.io/yourname/my-agent:sovereign
podman run -d --name my-agent \
--device nvidia.com/gpu=all \
-v ./.env:/app/.env:ro \
my-agent:sovereign --gateway
Air-Gapped
For classified or disconnected environments. Build on an internet-connected machine; transport via secure media.
# On internet-connected machine
./container/build.sh --sovereign --model llama3.2:3b
podman save my-agent:sovereign -o my-agent.tar
# Copy my-agent.tar + .env to USB
# On air-gapped machine
podman load -i my-agent.tar
podman run -d --name my-agent \
-v ./.env:/app/.env:ro \
my-agent:sovereign
Configuration
.env file
The generated .env.example is grouped by category: provider keys, per-capability credentials, gateway settings, sandbox config, observability. Copy it to .env and fill in what your agent needs.
# ─── Provider ───────────────────────────────────────────
ANTHROPIC_API_KEY=sk-ant-...
# OPENAI_API_KEY=sk-...
# GEMINI_API_KEY=AI...
# ─── Gateway (for --gateway mode) ──────────────────────
GATEWAY_HOST=127.0.0.1
GATEWAY_PORT=7891
GATEWAY_AUTH_TOKEN=change-me-to-something-long
# ─── Per-capability credentials ─────────────────────────
# Slack
SLACK_BOT_TOKEN=xoxb-...
# Email
IMAP_HOST=imap.gmail.com
IMAP_USER=you@gmail.com
IMAP_PASSWORD=<app-password>
# GitHub
GITHUB_TOKEN=ghp_...
# Notion
NOTION_API_KEY=secret_...
# Linear
LINEAR_API_KEY=lin_api_...
# Calendar
GOOGLE_CALENDAR_CREDENTIALS=./credentials.json
# Twilio
TWILIO_ACCOUNT_SID=AC...
TWILIO_AUTH_TOKEN=...
TWILIO_FROM_NUMBER=+1...
# ─── Web search (optional) ──────────────────────────────
# TAVILY_API_KEY=tvly-...
# BRAVE_SEARCH_API_KEY=...
# ─── Observability ──────────────────────────────────────
# LOG_LEVEL=INFO
Comments in .env.example link to where each provider issues keys. The generator pulls these from each capability's env_docs_urls field.
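For readers curious how a file in this format is consumed, here is a minimal parser for KEY=VALUE lines that skips comments. It is a sketch only: the runner may well use an existing dotenv library instead, and the function name is hypothetical.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


sample = """
# Provider
ANTHROPIC_API_KEY=sk-ant-example
# OPENAI_API_KEY=sk-...
GATEWAY_PORT=7891
"""
settings = parse_env(sample)
print(settings)
print(int(settings.get("GATEWAY_PORT", "7891")))
```

Commented-out lines such as the OpenAI key above are ignored, which is why uncommenting a line is enough to enable a provider or capability.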
Sovereign Models
In sovereign mode you bundle a model into the container image. The model runs locally via Ollama. No runtime LLM key needed.
# Common choices
./container/build.sh --sovereign --model llama3.2:3b # Fast, lightweight
./container/build.sh --sovereign --model phi3:mini # Strong reasoning
./container/build.sh --sovereign --model mistral:7b # Best quality on CPU
./container/build.sh --sovereign --model llama3.1:8b # Newest Llama
Architecture
Three components, one direction: web designs → Python generates → your runtime runs.
Web
Architect chat → scout-config
Generator
Workspace + zip
Runtime
Your agent runs
1. Web (myagentos.ai)
The architect chat is a Next.js app backed by an LLM (the user-provided key). It runs the Phase 1.5 interview, makes capability decisions, then emits a scout-config JSON block. spec-bridge.ts parses the block out of the chat stream and posts it to /api/generate-agent.
2. Generator (Python)
scout_architect validates the spec. The F30 capability-decision gate rejects unaccounted-for catalog capabilities. The F31 context gate rejects specs missing the 5 required Phase 1.5 fields. scout_generator builds the workspace (USER.md, SOUL.md, IDENTITY.md, AGENTS.md, STANDING_ORDERS.md), renders the runner template, vendors scout_runtime into _vendored/, and zips it.
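The capability-decision gate amounts to a set difference: anything in the catalog that the spec neither includes nor explicitly omits is unaccounted for. The sketch below illustrates that idea; the function name and the included/omitted split are assumptions about the spec, not scout_architect's real interface.

```python
def unaccounted_capabilities(catalog: set[str], included: set[str],
                             omitted: set[str]) -> list[str]:
    """Return catalog capabilities that were neither included nor omitted."""
    return sorted(catalog - included - omitted)


# A four-item stand-in for the full 28-capability catalog.
catalog = {"persistent_memory", "skills", "web_search", "tts"}
pending = unaccounted_capabilities(
    catalog,
    included={"persistent_memory", "skills"},
    omitted={"tts"},
)
print(pending)
```

A gate built this way passes only when the returned list is empty.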
3. Runtime (in your agent)
Every agent ships with scout_runtime vendored — no external runtime dependency. On boot it loads the workspace, wires the capabilities the spec requested, and starts either the REPL or the gateway depending on the CLI flag. You own everything in the zip.
Troubleshooting
"AnthropicProvider requires an api_key"
Set ANTHROPIC_API_KEY in .env. Or use --provider with the env var name to point at a different key (F45).
"task executor not available"
Your gateway is missing the CLI handler. Re-download a fresh agent post-PR #73 (F47). The fix wires TaskExecutor + the CLI handler into the gateway during boot.
WebSocket 404 on /v1/ws
pip install 'uvicorn[standard]'
You're missing the WebSocket extras (F46). The standard uvicorn install ships without them.
"no gateway token found" / 403 from scout-tui
Set GATEWAY_AUTH_TOKEN in the agent's .env and SCOUT_GATEWAY_TOKEN in the env where you run scout-tui. They must match.
"FINALIZE BLOCKED" in architect chat
Missing Phase 1.5 fields or capability decisions. The architect will list exactly which (F30 = capability decisions, F31 = context fields). Answer the missing questions or explicitly omit the missing capabilities and retry.
ModuleNotFoundError on agent boot
Either pip install -e . wasn't run inside the venv, or a stray NODE_ENV=production in your environment poisoned an unrelated npm step (relevant only for the web layer; agents are pure Python).
Container exits immediately
podman logs <agent-name>
Usually missing .env, invalid API key, or an import error from a capability whose extras weren't installed. Logs will name it.
Sovereign: Ollama won't start
The bundled model needs to fit in RAM. A 3B model needs 4GB+, a 7B needs 8GB+. Try a smaller model:
./container/build.sh --sovereign --model llama3.2:3b
Vercel deploy timing
The web layer (myagentos.ai/create) deploys in ~60-90s from merge to live. If you just published a PR and don't see the change yet, give it the full window.