Production & Enterprise Deployment Guide π
Scaling CommitVigil to 1000+ users requires moving from a local monolithic setup to a distributed, high-availability architecture.
ποΈ Enterprise Architecture
For large-scale teams, we separate the Ingestion Layer (API) from the Processing Layer (Workers).
- API Layer: Stateless FastAPI pods behind a Load Balancer (AWS ALB / Nginx).
- Processing Layer: Distributed
arqworkers scaling horizontally based on queue depth. - State Layer: Managed Redis (AWS ElastiCache) and PostgreSQL (AWS RDS).
βΈοΈ Kubernetes Orchestration
We provide production-ready K8s manifests in infra/k8s/.
Key Scaling Features:
- Replica Sets: Run multiple API instances to handle high traffic.
- Auto-Scaling (HPA): Workers automatically scale from 2 to 20+ pods based on CPU usage or queue backlog.
- Resource Constraints: Strict CPU/Memory limits to prevent noisy-neighbor issues on shared clusters.
Deployment Command:
π Automated Org-Wide Ingestion
Managers don't need to manually ingest commits. Use GitHub Webhooks at the Organization level:
- Go to GitHub Organization Settings -> Webhooks.
- Add a new webhook:
- Payload URL:
https://api.commitvigil.yourcloud.com/api/v1/ingest/git - Content Type:
application/json - Secret: Your fixed
API_KEY_SECRET - Events: Select
Pushes.
- Payload URL:
Result: Every commit from every repo in your organization is automatically processed for accountability.
π Bulk Identity Sync (SSO)
Mapping 1000+ users via curl is inefficient. Implement a Mapping Sync Service:
- Active Directory / Google Workspace: Write a small cronjob that polls your employee list and calls:
POST /api/v1/users/config/slack(Email -> Slack ID)POST /api/v1/users/config/git(Email -> Git Email)- Automatic Discovery: CommitVigil can be configured to use the Slack
users.lookupByEmailAPI to auto-map users on their first commitment.
π Monitoring & Observability
Enterprise deployment includes Prometheus + Grafana integration:
- SLIs (Service Level Indicators):
background_jobs_pending: How many commitments are waiting for AI processing?ai_token_usage_total: Cost tracking per team/department.intervention_acceptance_rate: How often are managers acting on AI pings?
π° FinOps: Optimizing LLM Costs
At 1000+ users, token costs can add up. CommitVigil helps you optimize:
- Provider Switching: Use Groq (Llama 3.3 70b) for high-speed, low-cost extraction, and OpenAI (GPT-4o) only for complex tone adaptation.
- Mock Mode for Non-Critical Repos: Use
LLM_PROVIDER="mock"for internal experimental repositories to save budget. - Caching: CommitVigil can be extended with a vector cache (like RedisVL) to reuse extracted tasks from similar commit messages.
[!IMPORTANT] Always deploy in a Zero-Trust environment using encrypted secrets and private network VPCs for the Database and Redis.