The Information Tax was killing us. The ops team was drowning in "where do I find X?" questions. SOPs were scattered across Google Drive, Notion, and Slack threads. Every time a tiered support ticket came in, someone had to manually search for the right protocol.
The solution wasn't another wiki page. It was a Capital System: a retrieval engine that serves the exact answer instantly.
Here is how I architected and deployed a private Vertex AI Search capability to solve this.
1. Design & Infrastructure
We chose Google Cloud Vertex AI Search because it provides managed, enterprise-grade retrieval out of the box, so we did not have to run our own vector database to get started. It respects IAM permissions and keeps data private to the VPC.
The Setup:
- GCP Project: Dedicated project for the search infrastructure to isolate billing and access.
- Data Sources: Connected Google Drive (for SOPs) and a dump of resolved Jira tickets (as CSVs) to the Vertex Data Store.
- Apps: Created a "Search" app and a "Chat" app within Vertex AI Agent Builder to test different retrieval modalities.
2. The System Architecture
The core of the Capital System is the retrieval loop. We didn't just want a search bar; we wanted answers delivered where the work happens (Slack).
The Workflow:
- Trigger: New question in the #ops-support Slack channel.
- Orchestration (Cloud Run): A lightweight Python service receives the webhook.
- Retrieval: The service calls the Vertex AI Search API with the user's query.
- Grounding: The LLM (Gemini via Vertex AI) summarizes the search results and cites the specific source documents.
- Response: The answer is posted back to the Slack thread with links to the original Drive docs.
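The workflow above can be sketched as a single handler. This is a minimal illustration, not the deployed service: `SearchHit`, `handle_slack_question`, and the three injected callables are hypothetical names, and in a real deployment `search` would wrap the Vertex AI Search API call, `summarize` the Gemini call, and `post` the Slack API call. The event fields (`text`, `channel`, `ts`, `thread_ts`) follow Slack's message event shape.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SearchHit:
    """One retrieved document (hypothetical shape for this sketch)."""
    title: str
    url: str
    snippet: str


# Hypothetical signatures for the three external dependencies.
SearchFn = Callable[[str], List[SearchHit]]          # wraps the retrieval call
SummarizeFn = Callable[[str, List[SearchHit]], str]  # wraps the grounding call
PostFn = Callable[[str, str, str], None]             # (channel, thread_ts, text)


def handle_slack_question(event: dict, search: SearchFn,
                          summarize: SummarizeFn, post: PostFn) -> str:
    """Run one pass of the retrieval loop for a Slack message event."""
    question = event["text"].strip()
    # Reply in the existing thread if there is one, else thread on the message.
    thread_ts = event.get("thread_ts", event["ts"])

    hits = search(question)             # Retrieval
    answer = summarize(question, hits)  # Grounding

    # Response: append links to the source documents in Slack link syntax.
    if hits:
        sources = "\n".join(f"- <{h.url}|{h.title}>" for h in hits)
        reply = f"{answer}\n\nSources:\n{sources}"
    else:
        reply = answer
    post(event["channel"], thread_ts, reply)
    return reply
```

Passing the dependencies in as plain callables keeps the orchestration logic testable without GCP or Slack credentials, which suits a small Cloud Run service.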
Critical Feature: "Not in Corpus" Behavior. We tuned the system to explicitly say "I cannot find that in the SOPs" when the confidence score is low. This keeps the bot from guessing when retrieval is weak, which reduces hallucinations, and it flags gaps in our documentation.
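A minimal sketch of that gating step follows. It assumes each hit carries a numeric relevance score. The `ScoredHit` shape, the `0.6` threshold, and the `gap_log` list are illustrative assumptions rather than the values we actually tuned.

```python
from dataclasses import dataclass
from typing import Callable, List

NOT_IN_CORPUS = "I cannot find that in the SOPs."
MIN_CONFIDENCE = 0.6  # assumed threshold; tune against real queries


@dataclass
class ScoredHit:
    """A retrieved document with a relevance score (hypothetical shape)."""
    title: str
    url: str
    score: float


def gate_hits(hits: List[ScoredHit],
              min_confidence: float = MIN_CONFIDENCE) -> List[ScoredHit]:
    """Keep only hits at or above the confidence threshold."""
    return [h for h in hits if h.score >= min_confidence]


def answer_or_fallback(question: str, hits: List[ScoredHit],
                       summarize: Callable[[str, List[ScoredHit]], str],
                       gap_log: List[str]) -> str:
    """Summarize confident hits, or refuse and record the documentation gap."""
    confident = gate_hits(hits)
    if not confident:
        gap_log.append(question)  # feeds the missing-SOP backlog
        return NOT_IN_CORPUS
    return summarize(question, confident)
```

Gating before the LLM call means low-confidence results never reach the prompt, so the model has nothing weak to summarize.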
3. Outcomes & Metrics
Deploying this Capital System had an immediate impact on the Information Tax.
- Search Time: Reduced from ~15 minutes per complex query to <10 seconds.
- Ticket Deflection: 40% of "Tier 1" questions were answered by the bot without human intervention.
- Documentation Quality: The "Not in Corpus" errors created a prioritized backlog for the ops lead to write missing SOPs.
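One way such a backlog could be derived, assuming every unanswered question is logged as a plain string: group near-identical questions and rank them by frequency. The `normalize` and `prioritized_backlog` helpers are hypothetical, and the normalization here is deliberately crude (lowercasing and whitespace collapsing only).

```python
from collections import Counter
from typing import Iterable, List, Tuple


def normalize(question: str) -> str:
    """Lowercase and collapse whitespace so trivial variants group together."""
    return " ".join(question.lower().split())


def prioritized_backlog(unanswered: Iterable[str],
                        top_n: int = 10) -> List[Tuple[str, int]]:
    """Return the most frequently unanswered questions, most common first."""
    counts = Counter(normalize(q) for q in unanswered)
    return counts.most_common(top_n)
```

The questions asked most often with no answer are the SOPs most worth writing first.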
Need this for your team?
I can scope and deploy a Vertex AI Capital Systems Pilot on your data in ~1 week. Check out the pilot details here.