The Brain Mesh is the physical compute layer Jarvis runs on. Five nodes work together to handle inference, orchestration, storage, and edge workloads — giving you dedicated, always-available AI infrastructure.

Node overview

ai-max

Primary compute node. Handles the most demanding inference workloads and hosts the majority of GPU-accelerated model runs.

ai-mini-x1

Secondary compute node. Runs inference in parallel with ai-max to distribute load and increase throughput.

jarvis-brain

Central orchestration node. Routes requests through LiteLLM, manages agents, and coordinates across the mesh.

dell-micro

Lightweight edge node. Handles low-latency, lightweight tasks without drawing on primary compute resources.

synologynas

Storage and NAS node. Persists model weights, vector stores, logs, and shared data across the mesh.

How nodes work together

Requests you send to Jarvis flow through a coordinated pipeline:
  1. jarvis-brain receives the request and routes it via LiteLLM based on the model and load conditions.
  2. ai-max and ai-mini-x1 run inference using their GPUs — two GPUs are available across the mesh for parallel workloads.
  3. dell-micro handles edge tasks — fast, low-resource completions that don’t need full GPU compute.
  4. synologynas provides shared storage that all nodes can read from and write to, including model weights and memory stores.
This architecture means a single request can span multiple nodes transparently. You interact with one endpoint; the mesh handles distribution.
You don’t need to target individual nodes for most tasks. LiteLLM routes requests automatically based on model availability and current load.
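As a rough illustration of the routing step, the decision LiteLLM makes in step 1 can be sketched as "pick the least-loaded node that serves the requested model." This is not the actual LiteLLM implementation; the node names match the mesh, but the load figures and model assignments below are invented for the example:

```python
# Hypothetical sketch of mesh routing: choose the least-loaded node that
# can serve the requested model. Node names are real; the loads and model
# assignments are illustrative only.

MESH = {
    "ai-max":     {"models": {"llama3", "mixtral"}, "load": 0.72},
    "ai-mini-x1": {"models": {"llama3"},            "load": 0.31},
    "dell-micro": {"models": {"phi3-mini"},         "load": 0.10},
}

def route(model: str) -> str:
    """Return the name of the least-loaded node that serves `model`."""
    candidates = [(info["load"], name)
                  for name, info in MESH.items()
                  if model in info["models"]]
    if not candidates:
        raise ValueError(f"no node serves model {model!r}")
    _, node = min(candidates)
    return node

print(route("llama3"))  # ai-mini-x1: it serves llama3 at the lower load
```

The real router also weighs availability and failover, but the core idea is the same: requests for a shared model spread across ai-max and ai-mini-x1, while lightweight models stay on dell-micro.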

Node roles at a glance

Node         | Role              | GPU | Best for
-------------|-------------------|-----|-----------------------------------
ai-max       | Primary compute   | Yes | Large models, heavy inference
ai-mini-x1   | Secondary compute | Yes | Parallel inference, overflow load
jarvis-brain | Orchestration     | No  | Routing, agents, coordination
dell-micro   | Edge compute      | No  | Fast, lightweight completions
synologynas  | Storage/NAS       | No  | Persistence, model weights, memory

Check node status

You can check the status of each node and the overall mesh through the monitoring dashboard or via the API.
Send a GET request to the health endpoint to see which nodes are reachable:
curl https://your-jarvis-host/api/mesh/status
The response lists each node, its current availability, and active model assignments.
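The exact response schema is not shown here, so treat the shape below as an assumption. Given a response roughly like it, a few lines of Python can pull out the nodes that are currently down:

```python
import json

# Hypothetical example of what /api/mesh/status might return; the real
# schema may differ -- check the response from your own deployment.
sample = json.loads("""
{
  "nodes": [
    {"name": "ai-max",       "reachable": true,  "models": ["llama3"]},
    {"name": "ai-mini-x1",   "reachable": true,  "models": ["llama3"]},
    {"name": "jarvis-brain", "reachable": true,  "models": []},
    {"name": "dell-micro",   "reachable": false, "models": []},
    {"name": "synologynas",  "reachable": true,  "models": []}
  ]
}
""")

def unreachable(status: dict) -> list[str]:
    """Names of nodes reported as not reachable."""
    return [n["name"] for n in status["nodes"] if not n["reachable"]]

print(unreachable(sample))  # ['dell-micro']
```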

Route requests to a specific node

By default, LiteLLM selects the best node for each request. If you need to target a specific node — for example, to run on a GPU node or to avoid a node under heavy load — pass the node parameter in your request.
{
  "model": "llama3",
  "node": "ai-max",
  "messages": [{ "role": "user", "content": "Your prompt here" }]
}
Only specify a node when you have a reason to. Letting LiteLLM route automatically gives you better load distribution and higher availability.

Next steps

Models

See the full model fleet running across the mesh.

Inference

Learn how to send inference requests through LiteLLM.