About Attach Gateway

    The missing identity & memory sidecar for LLM engines and multi-agent frameworks.

    Attach Gateway sits between your apps and model servers to close two gaps every AI stack has: who is this request from, and what context should it carry? It adds SSO authentication, agent-to-agent hand-off, and pluggable memory—without changing your model server.

    Why This Matters

    Most LLM engines (local or cloud) ship with no authentication by default.

    Multi-agent protocols assume a bearer token exists, but don't say how to issue or verify it.

    Teams end up with ad-hoc reverse proxies, leaked ports, and copy-pasted JWT code everywhere.

    Our Solution

    Attach Gateway is a resource server for AI: it validates identity, stamps headers for downstream engines, and mediates reads/writes to your memory store.

    Run it next to any HTTP model API and get secure, shareable context in minutes.

    How It Works

    1. Client exchanges credentials for a token (SSO or service account).

    2. Attach Gateway validates the token and adds identity headers.

    3. The request is forwarded to the model server with its context.
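
    The header-stamping step above can be sketched in a few lines. Note the header names below are illustrative, not Attach Gateway's actual contract:

```python
# Sketch of step 2: after the gateway verifies a token, it stamps verified
# identity onto the request before forwarding it downstream. Header names
# here are hypothetical, not Attach Gateway's real header contract.

def stamp_identity_headers(headers: dict, claims: dict) -> dict:
    """Return a copy of the request headers with verified identity added."""
    forwarded = dict(headers)
    forwarded["X-Verified-Sub"] = claims["sub"]  # who the request is from
    forwarded["X-Verified-Iss"] = claims["iss"]  # which IdP vouched for them
    forwarded.pop("Authorization", None)         # don't leak the raw token downstream
    return forwarded

claims = {"sub": "agent-42", "iss": "https://idp.example.test"}
out = stamp_identity_headers({"Authorization": "Bearer eyJ..."}, claims)
```

    Dropping the raw `Authorization` header is one zero-trust design choice; a deployment could equally forward it if the downstream engine needs it.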

    Built on First Principles

    Engineered like a seat-belt for LLM stacks—always on, invisible until you need it, replaceable without ripping out the car.

    Local-First, Cloud-Optional

    Runs offline on your laptop, VM, or container. Cloud hosting is a choice, not a requirement.

    Zero-Trust Defaults

    Every downstream endpoint is untrusted until Attach stamps verified identity headers.

    Memory Outside the Model

    Context lives in a pluggable storage bus you control. Hot-swap backends without code changes.
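
    A hot-swappable storage bus usually means backends share one small interface and the active one is chosen by config. This sketch uses a hypothetical interface, not Attach Gateway's actual API:

```python
# Hypothetical memory-bus sketch: every backend implements the same small
# protocol, so swapping SQLite for another store is a config change, not a rewrite.
import sqlite3
from typing import Protocol

class MemoryBackend(Protocol):
    def write(self, key: str, value: str) -> None: ...
    def read(self, key: str) -> "str | None": ...

class InMemoryBackend:
    def __init__(self) -> None:
        self._store: dict = {}
    def write(self, key: str, value: str) -> None:
        self._store[key] = value
    def read(self, key: str):
        return self._store.get(key)

class SQLiteBackend:
    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute("CREATE TABLE IF NOT EXISTS mem (k TEXT PRIMARY KEY, v TEXT)")
    def write(self, key: str, value: str) -> None:
        self._db.execute("INSERT OR REPLACE INTO mem VALUES (?, ?)", (key, value))
        self._db.commit()
    def read(self, key: str):
        row = self._db.execute("SELECT v FROM mem WHERE k = ?", (key,)).fetchone()
        return row[0] if row else None

BACKENDS = {"memory": InMemoryBackend, "sqlite": SQLiteBackend}

def make_backend(name: str) -> MemoryBackend:
    return BACKENDS[name]()  # e.g. driven by the MEM_BACKEND env var
```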

    Protocol Over Implementation

    Works with OIDC/DID, agent-to-agent hand-off, and emerging AI protocols.

    Stateless & Observable

    JWT is the state. Tracing and metrics are built-in for real-time auditing.
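
    "JWT is the state" means verifying a request needs only the token and key material, never a session store, so any replica can handle any request. Real OIDC deployments verify RS256 signatures against the issuer's JWKS; this stdlib-only sketch uses HS256 with a shared secret to show the stateless shape:

```python
# Stateless verification sketch: sign and verify an HS256 JWT with only the
# standard library. Production setups use RS256 keys fetched from the IdP's
# JWKS endpoint instead of a shared secret.
import base64, hashlib, hmac, json

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_hs256(claims: dict, secret: bytes) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify_hs256(token: str, secret: bytes):
    """Return the claims if the signature checks out, else None."""
    header, payload, sig = token.split(".")
    signing_input = f"{header}.{payload}".encode()
    expected = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None  # reject: signature mismatch, no session lookup involved
    pad = "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload + pad))
```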

    Privacy by Design

    No prompt bodies logged by default. Memory writes require signed calls. We never train on your data.

    Comprehensive Feature Set

    Modular Authentication

    Support for multiple identity providers, configurable via environment variables.

    Auth0
    Descope
    Okta
    Custom OIDC
    DID-JWT
    Service Accounts

    Pluggable Memory

    Hot-swappable memory backends for context storage and retrieval.

    Weaviate
    Postgres
    SQLite
    Files
    Mem0 (coming soon)
    Zep (coming soon)

    Built-in Monitoring

    Comprehensive observability and token usage tracking out of the box.

    Prometheus metrics
    OpenMeter integration
    OpenTelemetry tracing
    Custom dashboards

    Works With Everything

    LLM Engines & APIs

    Ollama
    vLLM
    LM Studio
    OpenAI APIs
    HTTP micro-agents
    REST APIs

    Agent Protocols

    Google A2A
    MCP
    OpenHands
    Temporal workflows
    Custom protocols
    Multi-agent frameworks

    Configuration Made Simple

    Environment-Driven Configuration

    Everything configurable via .env variables. No complex YAML files or UI configuration needed.

    Authentication

    • AUTH_BACKEND=auth0
    • OIDC_ISSUER=...
    • OIDC_AUD=...

    Memory

    • MEM_BACKEND=weaviate
    • WEAVIATE_URL=...
    • MAX_TOKENS_PER_MIN=...

    Monitoring

    • USAGE_METERING=prometheus
    • OPENMETER_API_KEY=...
    • LOG_LEVEL=info
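
    Putting the three groups together, a complete .env might look like this; all values below are placeholders:

```shell
# Illustrative .env for Attach Gateway; replace placeholder values with your own.

# Authentication
AUTH_BACKEND=auth0
OIDC_ISSUER=https://your-tenant.auth0.com/
OIDC_AUD=https://your-api.example.com

# Memory
MEM_BACKEND=weaviate
WEAVIATE_URL=http://localhost:8080
MAX_TOKENS_PER_MIN=60000

# Monitoring
USAGE_METERING=prometheus
LOG_LEVEL=info
```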

    Why It's Safe

    Run Fully Offline

    Local SSO or pre-issued tokens plus local memory. No internet required.

    You Own Storage

    Attach mediates access to your data stores. We never see or store your content.

    Auditable by Default

    Every request has a trace ID. Built-in metrics for identity flows and memory access.
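
    "Every request has a trace ID" typically means the gateway reuses a caller-supplied ID when present and mints one otherwise, so logs and metrics correlate end to end. A minimal sketch, with a hypothetical header name:

```python
# Trace-ID sketch: reuse the caller's ID if present, otherwise mint one.
# The header name is illustrative, not Attach Gateway's actual contract.
import uuid

TRACE_HEADER = "X-Trace-Id"

def ensure_trace_id(headers: dict):
    """Return (headers-with-trace-id, trace_id) for downstream correlation."""
    out = dict(headers)
    trace_id = out.get(TRACE_HEADER) or str(uuid.uuid4())
    out[TRACE_HEADER] = trace_id
    return out, trace_id
```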

    Frequently Asked Questions

    Do you store my data?

    No. You own storage; Attach mediates access. We never train on your data or store prompts.

    Can I run fully offline?

    Yes—local SSO or pre-issued tokens plus local memory. Perfect for air-gapped environments.

    How is this different from an API gateway?

    It's identity + memory semantics purpose-built for LLMs and agents, not generic HTTP routing.

    What about performance?

    Stateless design scales horizontally. JWT validation is fast, memory lookups are optional and async.

    Our Vision

    We're building the infrastructure layer that AI applications actually need in production. No enterprise sales calls, no "book a demo" nonsense. Everything we build is open source, and we share the real stories—including the failures and dead ends.

    If you're building with local LLMs, agent frameworks, or just trying to add authentication to your AI stack without losing your mind, Attach Gateway is for you.

    Ready to Secure Your LLM Stack?

    Get started with Attach Gateway in minutes. No registration required.