CrewAI makes multi-agent orchestration easy, but production deployments need auth and persistent memory. Attach Gateway adds both without changing your agent code.
Building production CrewAI applications? You'll quickly hit these walls:
Each agent request carries verified user identity. Know exactly who triggered which agent.
Weaviate-backed memory that persists across agent runs. Context follows users, not sessions.
A2A headers propagate identity across your entire crew. No token juggling required.
pip install attach-dev crewai
docker run --rm -d -p 6666:8080 \ -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \ semitechnologies/weaviate:1.30.5
export OIDC_ISSUER=https://your-domain.auth0.com export OIDC_AUD=crewai-app export MEM_BACKEND=weaviate attach-gateway --port 8080
from crewai import Agent, Crew
import os
# CrewAI uses Attach as the LLM endpoint
os.environ["OPENAI_API_BASE"] = "http://localhost:8080/v1"
os.environ["OPENAI_API_KEY"] = your_jwt_token
# Your agents now have auth + memory
researcher = Agent(
role="Researcher",
goal="Find relevant information",
# ... rest of config
)┌─────────────┐ ┌─────────────────┐ ┌─────────────┐
│ CrewAI │────▶│ Attach Gateway │────▶│ Ollama │
│ Agents │ │ (Auth+Memory) │ │ / vLLM │
└─────────────┘ └────────┬────────┘ └─────────────┘
│
▼
┌─────────────────┐
│ Weaviate │
│ (Vector Memory)│
└─────────────────┘Every agent request flows through Attach. User identity is verified, memory is read/written, and the request is forwarded to your LLM backend.
Each customer gets isolated memory and tracked usage. Bill per token consumed.
SSO integration ensures only authorized employees can trigger agent workflows.
When enabled, Attach stores conversation history in Weaviate (a vector database). Each user gets isolated memory—when User A's agent runs, it only sees User A's past conversations. Memory persists across sessions, so your agents remember context from previous runs.
Yes. When the first agent authenticates via Attach, subsequent agents in the crew inherit that identity through A2A (agent-to-agent) headers. Every agent action is attributed to the original authenticated user, maintaining a complete audit trail.
Attach applies rate limits at the user level, not the agent level. If a user runs multiple crews or agents, their combined token usage counts against their quota. This prevents users from bypassing limits by spawning multiple agents.
Currently, yes—Weaviate is the supported memory backend for vector storage. It's open-source and easy to run via Docker (just one command). Memory is optional—if you don't need persistence, you can run Attach without a memory backend.
Add enterprise auth and memory to CrewAI in minutes. No registration required.