Q AI — Architecture

This document covers the internal architecture of Q AI, including the agentic loop, model selection strategy, tool routing, and deployment topology.

System Architecture

                        ┌─────────────────────────┐
                        │      Load Balancer      │
                        │      (nginx / ALB)      │
                        └────────────┬────────────┘
                                     │
                        ┌────────────▼────────────┐
                        │        Q AI API         │
                        │         (:9100)         │
                        ├─────────────────────────┤
                        │  ┌───────────────────┐  │
                        │  │  Router           │  │
                        │  │  /chat            │  │
                        │  │  /chat/stream     │  │
                        │  │  /chat/confirm    │  │
                        │  │  /tools           │  │
                        │  └─────────┬─────────┘  │
                        │            │            │
                        │  ┌─────────▼─────────┐  │
                        │  │  Agentic Loop     │  │
                        │  │  ┌─────────────┐  │  │
                        │  │  │  Provider   │  │  │
                        │  │  │  Selector   │  │  │
                        │  │  └──────┬──────┘  │  │
                        │  │         │         │  │
                        │  │  ┌──────▼──────┐  │  │
                        │  │  │ Tool Router │  │  │
                        │  │  └──────┬──────┘  │  │
                        │  │         │         │  │
                        │  │  ┌──────▼──────┐  │  │
                        │  │  │ Confirm Gate│  │  │
                        │  │  └─────────────┘  │  │
                        │  └───────────────────┘  │
                        └──┬────────┬────────┬────┘
                           │        │        │
              ┌────────────▼┐  ┌────▼───┐  ┌─▼─────────┐
              │ MCP Server  │  │Gateway │  │ LLM APIs  │
              │ (16 tools)  │  │(:9099) │  │ (Claude,  │
              └──────┬──────┘  └───┬────┘  │  GPT...)  │
                     │             │       └───────────┘
              ┌──────▼──────┐  ┌───▼──────────┐
              │ MEV Engine  │  │ Chain RPCs   │
              │ (:8080)     │  │ (ETH, BSC..) │
              └─────────────┘  └──────────────┘

Agentic Loop

The agentic loop is the core processing pipeline for every Q AI request:

┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐
│  Input   │────►│ Context  │────►│   LLM    │────►│  Tool    │
│ Parser   │     │ Builder  │     │ Provider │     │ Executor │
└──────────┘     └──────────┘     └──────┬───┘     └────┬─────┘
                                         │              │
                                         │    ┌─────────▼──────┐
                                         │    │ Result Buffer  │
                                         │    └─────────┬──────┘
                                         │              │
                                         ◄──────────────┘
                                    (loop until done)
                                         │
                                  ┌──────▼───────┐
                                  │  Response    │
                                  │  Synthesizer │
                                  └──────────────┘

Step-by-Step

  1. Input Parser -- validates the incoming JSON-RPC request, extracts the user message, conversation ID, and role.

  2. Context Builder -- assembles the LLM prompt:

    • System prompt with MEV domain knowledge
    • Conversation history (up to context window limit)
    • Available tools filtered by user role (RBAC)
    • Current engine state summary (optional, for complex queries)

  3. LLM Provider -- sends the assembled prompt to the selected model. The provider returns either a text response or one or more tool calls.

  4. Tool Executor -- if the LLM requests tool calls:

    • Validates tool names and parameters
    • Checks RBAC permissions
    • For mutating tools: pauses and returns a confirmation request to the user
    • For read-only tools: executes immediately
    • Returns results to the LLM for the next iteration

  5. Result Buffer -- accumulates tool results. The loop continues (steps 3-4) until the LLM produces a final text response with no further tool calls. Typically 1-3 iterations.

  6. Response Synthesizer -- formats the final response for the client (JSON for API, markdown for chat, structured data for SDKs).
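
The steps above can be sketched as a single loop. This is a minimal illustration, not the actual Q AI implementation; call_llm, execute_tool, and the message and tool-call shapes are hypothetical stand-ins:

```python
# Sketch of the agentic loop (steps 3-5). All names are illustrative:
# call_llm and execute_tool are injected stand-ins for the real provider
# and tool-executor components.

def agentic_loop(messages, tools, call_llm, execute_tool, max_iterations=5):
    """Run the LLM until it returns plain text instead of tool calls."""
    result_buffer = []  # accumulates tool results across iterations
    for _ in range(max_iterations):
        reply = call_llm(messages, tools)
        tool_calls = reply.get("tool_calls", [])
        if not tool_calls:
            # Final text response: hand off to the response synthesizer.
            return {"text": reply["text"], "tool_results": result_buffer}
        for call in tool_calls:
            if call["mutating"]:
                # Mutating tools pause the loop pending user confirmation.
                return {"confirmation_required": call}
            result = execute_tool(call["name"], call["params"])
            result_buffer.append(result)
            messages.append({"role": "tool", "name": call["name"],
                             "content": result})
    raise RuntimeError("agentic loop exceeded max iterations")
```

A typical request resolves within the first few iterations, matching the 1-3 figure above; the iteration cap is a safety valve.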

Confirmation Gate

Mutating tools require explicit user confirmation:

// Q AI response when a mutating tool is requested
{
  "type": "confirmation_required",
  "action": "bundle_submit",
  "params": {
    "txs": ["0xabc...", "0xdef..."],
    "chain": "ethereum",
    "blockNumber": 19482300
  },
  "message": "I'll submit this 2-tx bundle targeting block 19,482,300 on Ethereum. Proceed?",
  "confirm_id": "conf_8a7b6c"
}

// User confirms via POST /api/v1/chat/confirm
{
  "confirm_id": "conf_8a7b6c",
  "confirmed": true
}
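
Server-side, the gate can be modeled as a small store of parked actions keyed by confirm_id. The sketch below is hypothetical: names like request_confirmation and handle_confirm are illustrative, and a dict stands in for the Redis-backed conversation cache:

```python
# Hypothetical confirmation gate: mutating actions are parked under a
# confirm_id until the user posts a decision to /api/v1/chat/confirm.
import secrets

_pending = {}  # confirm_id -> parked action (a dict stands in for Redis)

def request_confirmation(action, params, message):
    """Park a mutating action and build the confirmation_required response."""
    confirm_id = "conf_" + secrets.token_hex(3)
    _pending[confirm_id] = {"action": action, "params": params}
    return {
        "type": "confirmation_required",
        "action": action,
        "params": params,
        "message": message,
        "confirm_id": confirm_id,
    }

def handle_confirm(confirm_id, confirmed):
    """Resolve a parked action: execute on confirm, discard on reject."""
    parked = _pending.pop(confirm_id, None)
    if parked is None:
        return {"error": "unknown or expired confirm_id"}
    status = "executed" if confirmed else "cancelled"
    return {"status": status, "action": parked["action"]}
```

Because the parked action is popped on resolution, an unknown or replayed confirm_id is rejected: each confirmation is usable at most once.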

Model Selection Strategy

Query Complexity   Criteria                                               Model Selected
Simple             Status checks, health, single-tool queries             Llama 3 8B / Mistral Medium
Standard           Multi-step queries, analytics, bundle analysis         Claude Sonnet / GPT-4o
Complex            Forensics, multi-chain analysis, strategy evaluation   Claude Opus / GPT-4-turbo
Code               Transaction decoding, contract analysis                DeepSeek Coder / Claude Sonnet

Selection heuristics:

  • Token count -- short queries (fewer than 50 tokens) default to fast models.
  • Tool count -- queries likely requiring 3+ tools escalate to larger models.
  • Keyword detection -- "forensic", "analyze", "compare" trigger complex models.
  • Explicit override -- users can specify model in the request.
  • Fallback -- if the primary provider times out (30s), the next provider in the chain is used.
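
Combined, these heuristics amount to a short classifier. The 50-token and 3-tool thresholds and the keyword list come from the text above; the function name and tier labels are illustrative:

```python
# Sketch of the model-selection heuristics. Thresholds (50 tokens, 3 tools)
# and keywords mirror the list above; select_tier is a hypothetical name.
COMPLEX_KEYWORDS = ("forensic", "analyze", "compare")

def select_tier(query, estimated_tools=1, override=None):
    """Map a query to a complexity tier: simple / standard / complex."""
    if override:
        return override  # explicit user override always wins
    lowered = query.lower()
    if any(k in lowered for k in COMPLEX_KEYWORDS) or estimated_tools >= 3:
        return "complex"
    if len(query.split()) < 50:  # crude stand-in for a real token count
        return "simple"
    return "standard"
```

The tier then maps to a provider chain (e.g. complex -> Claude Opus / GPT-4-turbo per the table), with the 30s-timeout fallback walking that chain.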

Tool Routing

User Query: "Show me relay stats and submit this bundle"
                    │
        ┌───────────▼────────────┐
        │    Tool Router         │
        │                        │
        │  1. Parse tool calls   │
        │  2. Check permissions  │
        │  3. Classify mutation  │
        └───┬────────────┬───────┘
            │            │
    ┌───────▼──────┐  ┌──▼───────────┐
    │  Read-Only   │  │   Mutating   │
    │              │  │              │
    │  relay_stats │  │ bundle_      │
    │              │  │ submit       │
    │  Execute     │  │ HOLD for     │
    │  immediately │  │ confirmation │
    └──────────────┘  └──────────────┘

Read-only tools execute in parallel when the LLM requests multiple. Mutating tools are serialized and gated.
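
That split can be expressed directly: fan read-only calls out to a thread pool, and return mutating calls unexecuted for the confirmation gate. A sketch with illustrative names (route_tool_calls, execute_fn, and the call dict shape are assumptions, not the real API):

```python
# Sketch of the read-only / mutating split. Read-only calls run in
# parallel; mutating calls are returned unexecuted, to be held for
# user confirmation. execute_fn is an injected stand-in.
from concurrent.futures import ThreadPoolExecutor

def route_tool_calls(calls, execute_fn):
    """Execute read-only calls in parallel; hold mutating calls."""
    read_only = [c for c in calls if not c["mutating"]]
    held = [c for c in calls if c["mutating"]]
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(
            lambda c: execute_fn(c["name"], c["params"]), read_only))
    return {"results": results, "held_for_confirmation": held}
```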

Deployment Components

Component    Container    Port   Health Check      Purpose
Q AI API     q-ai-api     9100   GET /healthz      Main API server
MCP Server   q-ai-mcp     9101   GET /health       MCP tool server
MEV Engine   engine       8080   GET /health       MEV extraction engine
Gateway      gateway      9099   GET /health       WebSocket proxy
Redis        redis        6379   redis-cli ping    Conversation cache
Prometheus   prometheus   9090   GET /-/healthy    Metrics collection
Grafana      grafana      3000   GET /api/health   Dashboards

Docker Compose

version: "3.8"

services:
  q-ai-api:
    image: yoorquezt/q-ai:latest
    ports:
      - "9100:9100"
    environment:
      - Q_AI_PORT=9100
      - Q_AI_ENGINE_URL=http://engine:8080
      - Q_AI_GATEWAY_URL=ws://gateway:9099
      - Q_AI_MCP_URL=http://q-ai-mcp:9101
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - Q_AI_REDIS_URL=redis://redis:6379
      - Q_AI_LOG_LEVEL=info
    depends_on:
      engine:
        condition: service_healthy
      gateway:
        condition: service_healthy
      redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:9100/healthz"]
      interval: 10s
      timeout: 5s
      retries: 3

  q-ai-mcp:
    image: yoorquezt/q-ai-mcp:latest
    ports:
      - "9101:9101"
    environment:
      - MCP_PORT=9101
      - MCP_ENGINE_URL=http://engine:8080
      - MCP_LOG_LEVEL=info
    depends_on:
      engine:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:9101/health"]
      interval: 10s
      timeout: 5s
      retries: 3

  engine:
    image: yoorquezt/mev-engine:latest
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:8080/health"]
      interval: 10s
      timeout: 5s
      retries: 3

  gateway:
    image: yoorquezt/yqmev-gateway:latest
    ports:
      - "9099:9099"
    environment:
      - YQMEV_UPSTREAM_URL=http://engine:8080
    depends_on:
      engine:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:9099/health"]
      interval: 10s
      timeout: 5s
      retries: 3

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3

# Start the full Q AI stack
docker compose -f docker-compose.q-ai.yaml up -d

# Check health
curl http://localhost:9100/healthz