Now in Public Beta

    Ollama is open. Your AI stack shouldn't be.

    Add OIDC auth, per-user quotas, and audit trails to any local LLM. Attach is the thin layer between your models and production.

    pip install attach-dev

    Zero lock-in

    Attach is a lightweight proxy—not a platform. It sits in front of Ollama, vLLM, or any OpenAI-compatible server. Verify JWTs, enforce quotas, log everything. No code changes to your existing stack.

    Identity that propagates

    Attach stamps X-Attach-User on every request. Your downstream tools and agents see the same verified identity—no auth logic scattered across your stack.
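The header names above come from the Attach docs; everything else in this sketch is illustrative. A downstream service can trust the gateway-stamped identity instead of re-implementing auth:

```python
# Sketch of a downstream handler that trusts the gateway's stamped headers.
# X-Attach-User / X-Attach-Session are the documented header names; the
# handler itself is a hypothetical example, not part of Attach.

def handle_request(headers: dict) -> str:
    """Resolve the caller's identity from gateway-stamped headers."""
    user = headers.get("X-Attach-User")
    session = headers.get("X-Attach-Session")
    if user is None:
        # Request bypassed the gateway: reject rather than guess.
        raise PermissionError("missing X-Attach-User; not routed via gateway")
    return f"user={user} session={session}"

print(handle_request({"X-Attach-User": "auth0|abc123",
                      "X-Attach-Session": "s-1"}))
```

Because the gateway is the only ingress, a missing header means the request never passed verification, so the safe default is to refuse it.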

    Everything you need for secure LLM apps

    Run it next to any model server and get secure, shareable context in under 1 minute.

    OIDC / JWT Verification

    Verifies OIDC/JWT or DID-JWT tokens and stamps X-Attach-User + X-Attach-Session headers so every downstream agent sees the same identity.
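Conceptually, verification boils down to checking the token's signed claims against your configured issuer and audience. The stdlib-only sketch below decodes a token's payload and applies those two claim checks; it deliberately skips the signature step (the gateway additionally verifies the signature against the issuer's JWKS), and the demo token is fabricated for illustration:

```python
import base64
import json

def decode_claims(jwt: str) -> dict:
    """Decode a JWT's payload segment. No signature check here -- a real
    verifier (like the gateway) must validate against the issuer's JWKS."""
    payload_b64 = jwt.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64url padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

def check_claims(claims: dict, issuer: str, audience: str) -> bool:
    """The claim checks that OIDC_ISSUER / OIDC_AUD from the quickstart map to."""
    return claims.get("iss") == issuer and claims.get("aud") == audience

# Build an unsigned demo token just to exercise the claim checks.
header = base64.urlsafe_b64encode(b'{"alg":"none"}').rstrip(b"=")
payload = base64.urlsafe_b64encode(json.dumps(
    {"iss": "https://YOUR_DOMAIN.auth0.com", "aud": "ollama-local",
     "sub": "auth0|abc123"}).encode()).rstrip(b"=")
token = b".".join([header, payload, b""]).decode()

claims = decode_claims(token)
assert check_claims(claims, "https://YOUR_DOMAIN.auth0.com", "ollama-local")
```

On success, the gateway forwards the request with `X-Attach-User` set from the verified subject (`sub`) claim, so downstream services never touch the raw token.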

    Agent-to-Agent Hand-off

    Implements /a2a/tasks/send + /tasks/status for Google A2A & OpenHands hand-off between multiple agents in your workflow.
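A hand-off is a POST of a task message to the gateway's `/a2a/tasks/send` endpoint. The body below follows the general JSON-RPC `tasks/send` shape from the A2A protocol, but treat the exact field names as an assumption and check the A2A spec and Attach docs for the authoritative schema:

```python
import json
import uuid

def a2a_send_payload(text: str) -> dict:
    """Illustrative JSON-RPC body for POSTing to /a2a/tasks/send.
    Field names are a sketch of the A2A tasks/send shape, not a
    guaranteed match for Attach's accepted schema."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),       # request id
        "method": "tasks/send",
        "params": {
            "id": str(uuid.uuid4()),   # task id, polled via /tasks/status
            "message": {
                "role": "user",
                "parts": [{"type": "text", "text": text}],
            },
        },
    }

body = a2a_send_payload("summarize the last run")
print(json.dumps(body, indent=2))
```

The task `id` in `params` is what you would later poll the status endpoint with to track the hand-off's progress.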

    Pluggable Memory Backend

    Mirrors prompts & responses to a memory backend (Weaviate Docker container by default) for persistent context and retrieval.
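The mirroring idea is simple: every prompt/response pair is written to a backend keyed by session, so later turns can retrieve context. In this sketch a plain dict stands in for the Weaviate container from the quickstart; the class and method names are hypothetical, not Attach's actual interface:

```python
# Dict-backed stand-in for a memory backend such as Weaviate.
# Interface is illustrative only -- see the Attach docs for the real plugin API.

class MemoryBackend:
    """Stores exchanges per session so later turns can retrieve context."""

    def __init__(self):
        self._store: dict[str, list[dict]] = {}

    def mirror(self, session: str, prompt: str, response: str) -> None:
        """Record one prompt/response exchange for a session."""
        self._store.setdefault(session, []).append(
            {"prompt": prompt, "response": response})

    def context(self, session: str) -> list[dict]:
        """Return all exchanges recorded for a session, oldest first."""
        return self._store.get(session, [])

backend = MemoryBackend()
backend.mirror("s-1", "hello", "Hi! How can I help?")
assert backend.context("s-1")[0]["prompt"] == "hello"
```

Because the session key comes from the gateway's `X-Attach-Session` header, context stays scoped to the verified user rather than leaking across tenants.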

    Workflow Traces

    Built-in Temporal integration for workflow orchestration and tracing across your entire multi-agent pipeline.

    OIDC / JWT / DID-JWT · Identity Headers · A2A Protocols · Memory Backend · Temporal Traces

    Get started in 60 seconds

    Secure, authenticated LLM endpoints on your local machine.

    # Install the package
    pip install attach-dev
    
    # Start memory in Docker
    docker run --rm -d -p 6666:8080 \
      -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
      semitechnologies/weaviate:1.30.5
    
    # Export your token
    export JWT="<paste Auth0 or DID token>"
    export OIDC_ISSUER=https://YOUR_DOMAIN.auth0.com
    export OIDC_AUD=ollama-local
    
    # Run gateway
    attach-gateway --port 8080 &
    
    # Make a protected call
    curl -H "Authorization: Bearer $JWT" \
         -d '{"model":"tinyllama","prompt":"hello"}' \
         http://localhost:8080/api/chat | jq .

    Prerequisites: Python 3.12, Ollama installed, Auth0 account or DID token
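If you'd rather call the gateway from Python than curl, the same protected request can be built with the stdlib. This sketch constructs the request (using the `JWT` variable and endpoint from the quickstart) without sending it; uncomment the `urlopen` call once the gateway and Ollama are running:

```python
import json
import os
import urllib.request

# Build the same protected call as the curl example, using only the stdlib.
jwt = os.environ.get("JWT", "<paste Auth0 or DID token>")
req = urllib.request.Request(
    "http://localhost:8080/api/chat",
    data=json.dumps({"model": "tinyllama", "prompt": "hello"}).encode(),
    headers={"Authorization": f"Bearer {jwt}",
             "Content-Type": "application/json"},
    method="POST",
)
# With the gateway live on :8080, this returns the model's response:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
print(req.get_header("Authorization"))
```

The only difference from talking to Ollama directly is the base URL and the bearer token, which is the "no code changes" claim in practice.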

    Join the community

    Connect with developers building secure LLM applications. Get help, share ideas, and contribute.

    Discord Community

    Real-time discussions, support, and collaboration with other developers.

    Join Discord

    GitHub Repository

    Contribute, report issues, and explore the source code.

    View on GitHub