Now in Public Beta

    Ollama is open. Your AI stack shouldn't be.

    Add OIDC auth, per-user quotas, and audit trails to any local LLM. Attach is the thin layer between your models and production.

    pip install attach-dev

    Zero lock-in

    Attach is a lightweight proxy—not a platform. It sits in front of Ollama, vLLM, or any OpenAI-compatible server. Verify JWTs, enforce quotas, log everything. No code changes to your existing stack.

    Identity that propagates

    Attach stamps X-Attach-User on every request. Your downstream tools and agents see the same verified identity—no auth logic scattered across your stack.
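The header names above come from the Attach docs; everything else in this sketch is illustrative. A downstream service can trust the gateway-stamped identity instead of re-implementing auth:

```python
# Sketch of a downstream handler that trusts the gateway's stamped headers.
# X-Attach-User / X-Attach-Session are the documented header names; the
# handler itself is a hypothetical example, not part of Attach.

def handle_request(headers: dict) -> str:
    """Resolve the caller's identity from gateway-stamped headers."""
    user = headers.get("X-Attach-User")
    session = headers.get("X-Attach-Session")
    if user is None:
        # Request bypassed the gateway: reject rather than guess.
        raise PermissionError("missing X-Attach-User; not routed via gateway")
    return f"user={user} session={session}"

print(handle_request({"X-Attach-User": "auth0|abc123",
                      "X-Attach-Session": "s-1"}))
```

Because the gateway is the only ingress, a missing header means the request never passed verification, so the safe default is to refuse it.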

    Everything you need for secure LLM apps

    Run it next to any model server and get secure, shareable context in under 1 minute.

    OIDC / JWT Verification

    Verifies OIDC/JWT or DID-JWT tokens and stamps X-Attach-User + X-Attach-Session headers so every downstream agent sees the same identity.
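Conceptually, verification boils down to checking the token's signed claims against your configured issuer and audience. The stdlib-only sketch below decodes a token's payload and applies those two claim checks; it deliberately skips the signature step (the gateway additionally verifies the signature against the issuer's JWKS), and the demo token is fabricated for illustration:

```python
import base64
import json

def decode_claims(jwt: str) -> dict:
    """Decode a JWT's payload segment. No signature check here -- a real
    verifier (like the gateway) must validate against the issuer's JWKS."""
    payload_b64 = jwt.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64url padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

def check_claims(claims: dict, issuer: str, audience: str) -> bool:
    """The claim checks that OIDC_ISSUER / OIDC_AUD from the quickstart map to."""
    return claims.get("iss") == issuer and claims.get("aud") == audience

# Build an unsigned demo token just to exercise the claim checks.
header = base64.urlsafe_b64encode(b'{"alg":"none"}').rstrip(b"=")
payload = base64.urlsafe_b64encode(json.dumps(
    {"iss": "https://YOUR_DOMAIN.auth0.com", "aud": "ollama-local",
     "sub": "auth0|abc123"}).encode()).rstrip(b"=")
token = b".".join([header, payload, b""]).decode()

claims = decode_claims(token)
assert check_claims(claims, "https://YOUR_DOMAIN.auth0.com", "ollama-local")
```

On success, the gateway forwards the request with `X-Attach-User` set from the verified subject (`sub`) claim, so downstream services never touch the raw token.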

    Agent-to-Agent Hand-off

    Implements /a2a/tasks/send + /tasks/status for Google A2A & OpenHands hand-off between multiple agents in your workflow.
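A hand-off is a POST of a task message to the gateway's `/a2a/tasks/send` endpoint. The body below follows the general JSON-RPC `tasks/send` shape from the A2A protocol, but treat the exact field names as an assumption and check the A2A spec and Attach docs for the authoritative schema:

```python
import json
import uuid

def a2a_send_payload(text: str) -> dict:
    """Illustrative JSON-RPC body for POSTing to /a2a/tasks/send.
    Field names are a sketch of the A2A tasks/send shape, not a
    guaranteed match for Attach's accepted schema."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),       # request id
        "method": "tasks/send",
        "params": {
            "id": str(uuid.uuid4()),   # task id, polled via /tasks/status
            "message": {
                "role": "user",
                "parts": [{"type": "text", "text": text}],
            },
        },
    }

body = a2a_send_payload("summarize the last run")
print(json.dumps(body, indent=2))
```

The task `id` in `params` is what you would later poll the status endpoint with to track the hand-off's progress.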

    Pluggable Memory Backend

    Mirrors prompts & responses to a memory backend (Weaviate Docker container by default) for persistent context and retrieval.
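The mirroring idea is simple: every prompt/response pair is written to a backend keyed by session, so later turns can retrieve context. In this sketch a plain dict stands in for the Weaviate container from the quickstart; the class and method names are hypothetical, not Attach's actual interface:

```python
# Dict-backed stand-in for a memory backend such as Weaviate.
# Interface is illustrative only -- see the Attach docs for the real plugin API.

class MemoryBackend:
    """Stores exchanges per session so later turns can retrieve context."""

    def __init__(self):
        self._store: dict[str, list[dict]] = {}

    def mirror(self, session: str, prompt: str, response: str) -> None:
        """Record one prompt/response exchange for a session."""
        self._store.setdefault(session, []).append(
            {"prompt": prompt, "response": response})

    def context(self, session: str) -> list[dict]:
        """Return all exchanges recorded for a session, oldest first."""
        return self._store.get(session, [])

backend = MemoryBackend()
backend.mirror("s-1", "hello", "Hi! How can I help?")
assert backend.context("s-1")[0]["prompt"] == "hello"
```

Because the session key comes from the gateway's `X-Attach-Session` header, context stays scoped to the verified user rather than leaking across tenants.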

    Workflow Traces

    Built-in Temporal integration for workflow orchestration and tracing across your entire multi-agent pipeline.

    OIDC / JWT / DID-JWT · Identity Headers · A2A Protocols · Memory Backend · Temporal Traces

    Get started in 60 seconds

    Secure, authenticated LLM endpoints on your local machine.

    # Install the package
    pip install attach-dev
    
    # Start memory in Docker
    docker run --rm -d -p 6666:8080 \
      -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
      semitechnologies/weaviate:1.30.5
    
    # Export your token
    export JWT="<paste Auth0 or DID token>"
    export OIDC_ISSUER=https://YOUR_DOMAIN.auth0.com
    export OIDC_AUD=ollama-local
    
    # Run gateway
    attach-gateway --port 8080 &
    
    # Make a protected call
    curl -H "Authorization: Bearer $JWT" \
         -d '{"model":"tinyllama","prompt":"hello"}' \
         http://localhost:8080/api/chat | jq .

    Prerequisites: Python 3.12, Ollama installed, Auth0 account or DID token
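If you'd rather call the gateway from Python than curl, the same protected request can be built with the stdlib. This sketch constructs the request (using the `JWT` variable and endpoint from the quickstart) without sending it; uncomment the `urlopen` call once the gateway and Ollama are running:

```python
import json
import os
import urllib.request

# Build the same protected call as the curl example, using only the stdlib.
jwt = os.environ.get("JWT", "<paste Auth0 or DID token>")
req = urllib.request.Request(
    "http://localhost:8080/api/chat",
    data=json.dumps({"model": "tinyllama", "prompt": "hello"}).encode(),
    headers={"Authorization": f"Bearer {jwt}",
             "Content-Type": "application/json"},
    method="POST",
)
# With the gateway live on :8080, this returns the model's response:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
print(req.get_header("Authorization"))
```

The only difference from talking to Ollama directly is the base URL and the bearer token, which is the "no code changes" claim in practice.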

    Join the community

    Connect with developers building secure LLM applications. Get help, share ideas, and contribute.

    Discord Community

    Real-time discussions, support, and collaboration with other developers.

    Join Discord

    GitHub Repository

    Contribute, report issues, and explore the source code.

    View on GitHub