PRD: Agentic Chat Feature for PermitProof
Overview
This PRD describes the integration of an AI-powered agentic chat interface into the PermitProof web application. The chat will provide users with an intelligent assistant that can answer questions, retrieve information, and perform actions across the PermitProof platform using natural language.
Background
Current State
- PermitProof has a comprehensive gRPC API exposed as REST endpoints:
  - Local Development: gRPC-Gateway proxy on http://localhost:8082
  - Production: ESPv2 API Gateway on Cloud Run (e.g., https://construction-code-expert-esp2-prod-xxx.run.app)
- An experimental ArchitecturalPlanReviewAgent demonstrates ADK integration with OpenAPI toolset
- ADK provides a reference chat UI implementation in its development environment
- Users currently interact with the system through traditional UI components
Problem Statement
Users need to:
- Quickly find information across multiple pages, files, and compliance reports
- Get explanations of complex building code requirements
- Navigate large architectural plan documents efficiently
- Understand compliance issues without deep technical knowledge
Traditional UI navigation requires users to know exactly where to find information, resulting in:
- Time wasted clicking through multiple screens
- Difficulty discovering features
- Steep learning curve for new users
- Reduced productivity for power users
Goals
Primary Goals
- Natural Language Interface: Enable users to interact with PermitProof using conversational queries
- Intelligent Tool Usage: Agent should automatically select and use appropriate tools (API endpoints) to answer questions
- Context-Aware Responses: Maintain conversation history and project context
- Seamless Integration: Embed chat within existing Angular Material 3 UI without disrupting workflows
Secondary Goals
- Progressive Disclosure: Users discover PermitProof features through conversation
- Accessibility: Provide alternative interaction method for users with different needs
- Analytics: Track common user queries to improve product roadmap
- Mobile-Friendly: Support conversational interface on smaller screens
Non-Goals
- Voice Input: Not including speech-to-text in initial release
- Agent Training: Not allowing users to train custom agents
- Multi-Agent Orchestration: Single agent only (no agent-to-agent communication)
- Third-Party Integrations: Only PermitProof APIs in scope
Success Metrics
Quantitative Metrics
- Adoption Rate: 40% of active users try chat within first month
- Retention Rate: 60% of users who try chat use it again within 7 days
- Task Completion: 70% of chat sessions result in successful task completion
- Response Time: P95 response time < 3 seconds
- API Success Rate: 95% of tool calls succeed
Qualitative Metrics
- User Satisfaction: Net Promoter Score (NPS) > 40
- Ease of Use: System Usability Scale (SUS) score > 70
- User Feedback: Positive sentiment in 80% of feedback
User Stories
US-1: Document Navigation and Discovery
As a building inspector
I want to ask "Where in the project files is the electrical plan for the kitchen in 2-bedroom units?"
So that I can quickly locate specific plan sheets without manually browsing all pages
Acceptance Criteria:
- Chat searches across all plan pages and metadata
- Returns specific page numbers and sheet references
- Shows thumbnail previews of matching pages
- Provides direct links to view full page
US-2: Code Compliance Analysis
As an architect
I want to ask "Does the stairwell width on the second floor plan meet IBC requirements?"
So that I can verify compliance without manually looking up code sections
Acceptance Criteria:
- Agent identifies relevant plan pages (e.g., second floor architectural)
- Calls GetApplicableCodeSections API for stairwell requirements
- Extracts actual dimensions from plan if available
- Compares against IBC Chapter 10 requirements (minimum 44" for exit access)
- Provides clear pass/fail assessment with code citations
US-3: Compliance Issue Discovery
As a building inspector
I want to ask "Show me all pages with fire egress violations in this residential project"
So that I can focus my review on critical safety compliance issues
Acceptance Criteria:
- Agent calls GetPageComplianceReport for all pages
- Filters for fire/life safety violations (IBC Chapter 10)
- Returns list of pages with specific violations
- Shows severity level (major/minor) for each issue
- Provides one-click navigation to flagged pages
US-4: Technical Specification Extraction
As a permit coordinator
I want to ask "What is the building height and how many stories are shown in these plans?"
So that I can verify zoning compliance without manually reading all sheets
Acceptance Criteria:
- Agent searches title blocks, elevation sheets, and site plans
- Extracts building height in feet/meters
- Counts number of stories from floor plans
- Cites specific pages where information was found
- Compares against typical zoning requirements if applicable
US-5: Code Research and Explanation
As a junior architect
I want to ask "Explain the occupancy classification requirements for a mixed-use building"
So that I can understand which code sections apply to my project
Acceptance Criteria:
- Agent searches ICC IBC 2021 for occupancy classifications (Chapter 3)
- Explains mixed-use building rules (IBC 508)
- Provides examples of common mixed-use scenarios
- Cites specific code sections with brief summaries
- Offers to analyze current project plans if applicable
US-6: Cross-Reference Analysis
As a plan reviewer
I want to ask "Does the plumbing shown on the mechanical plans match what's on the architectural floor plans for the second floor?"
So that I can identify coordination issues between disciplines
Acceptance Criteria:
- Agent identifies relevant pages (architectural vs mechanical)
- Extracts plumbing fixture locations from both plan sets
- Compares locations and flags discrepancies
- Highlights potential conflicts or missing elements
- Provides page references for each finding
Functional Requirements
FR-1: Chat Interface Components
FR-1.1: Chat Window
- Floating Action Button: Material 3 Extended FAB in bottom-right corner
- Expandable Panel: Slides up from bottom on mobile, side panel on desktop
- Context Bar: Sticky header showing current context (project, file, page, pinned pages)
- Context Chips: Visual chips showing active elements with dismiss/pin actions
- Message List: Scrollable list of user and agent messages
- Input Field: Material 3 text field with send button
- Typing Indicator: Animated indicator when agent is processing
FR-1.2: Message Rendering
- User Messages: Right-aligned, colored bubble (primary color)
- Agent Messages: Left-aligned, colored bubble (surface-variant)
- Markdown Support: Render formatted text, lists, code blocks
- Link Rendering: Clickable links to navigate to specific pages/sections
- Image Display: Show thumbnails of plan pages inline
- Thinking Tokens Display: Show agent's reasoning process in muted, collapsible section
- Tool Call Progress: Real-time display of API calls in progress with loading indicators
- Tool Call Results: Collapsible sections showing:
  - Tool name and operation ID
  - Input parameters (formatted JSON)
  - Response data (formatted, truncated if large)
  - Execution time
  - Success/error status
- Progressive Rendering: Stream content as it arrives (thinking → tool calls → response)
- Expandable Details: All technical details (thinking, tool calls) collapsed by default to save space
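The "formatted, truncated if large" rule for tool call payloads could be implemented roughly as follows. This is a minimal sketch: the helper name `formatToolPayload` and the character limit are assumptions for illustration, not part of the existing codebase.

```typescript
// Hypothetical helper: pretty-prints a tool call payload and truncates it
// when it exceeds maxChars, so an expanded section stays readable.
function formatToolPayload(value: unknown, maxChars: number): string {
  // JSON.stringify yields undefined for values such as functions; String()
  // turns that into the text "undefined" instead of failing.
  const json = String(JSON.stringify(value, null, 2));
  if (json.length <= maxChars) {
    return json;
  }
  const omitted = json.length - maxChars;
  return `${json.slice(0, maxChars)}\n... (${omitted} more characters)`;
}
```

The limit would likely be tuned per field, for example a larger budget for response data than for input parameters.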
FR-1.3: Chat Controls
- Clear History: Button to clear conversation (with confirmation)
- New Session: Start fresh conversation
- Export Chat: Download conversation as text/JSON
- Settings: Configure agent behavior (temperature, verbosity)
FR-2: Agent Backend Architecture
FR-2.1: Agent Service (Java)
- LlmAgent: ADK-based agent using Gemini 2.5 Flash or Pro with thinking enabled
- OpenApiToolset Integration: Automatically load all tools from openapi.yaml
- Session Management: Maintain conversation history per user/project
- Context Injection: Include current project/page context in agent prompts
- Thinking Mode: Enable Gemini's thinking tokens for transparent reasoning
- Event Streaming: Stream thinking, tool calls, and responses separately for progressive UI updates
FR-2.2: Tool Execution
- Automatic Tool Selection: Agent determines which APIs to call
- Authentication Forwarding: Pass user's Firebase auth token to API calls
- Error Handling: Gracefully handle API errors, with retry logic for transient failures
- Rate Limiting: Prevent abuse with per-user rate limits
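One common way to enforce per-user rate limits is a token bucket. The sketch below illustrates the idea in TypeScript to match the other snippets in this document; the real limiter would live in the Java backend, and the class name, capacity, and refill rate are assumptions.

```typescript
// Hypothetical per-user token bucket. Capacity and refill rate are illustrative.
interface Bucket {
  tokens: number;
  updatedAtMs: number;
}

class PerUserRateLimiter {
  private readonly buckets = new Map<string, Bucket>();
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Returns true if the request may proceed. The clock is passed in so the
  // limiter stays deterministic and easy to test.
  tryAcquire(userId: string, nowMs: number): boolean {
    const previous = this.buckets.get(userId) ?? { tokens: this.capacity, updatedAtMs: nowMs };
    const elapsedSeconds = Math.max(0, nowMs - previous.updatedAtMs) / 1000;
    const refilled = Math.min(this.capacity, previous.tokens + elapsedSeconds * this.refillPerSecond);
    const allowed = refilled >= 1;
    this.buckets.set(userId, { tokens: allowed ? refilled - 1 : refilled, updatedAtMs: nowMs });
    return allowed;
  }
}
```

With multiple Cloud Run instances, the bucket state would need to live in shared storage rather than in process memory; this sketch ignores that.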
FR-2.3: Streaming Responses
Primary Transport: Server-Sent Events (SSE)
SSE is a web standard for server-to-client streaming over HTTP:
- How It Works: Server pushes updates to client as text/event-stream
- Browser API: JavaScript EventSource for consuming streams
- One-Way: Server → Client (perfect for agent responses)
- Auto-Reconnect: Browser handles connection drops
- Firewall-Friendly: Uses standard HTTP, works through proxies
Implementation:
- gRPC Side: ChatService uses server-side streaming (stream ChatMessageChunk)
- REST Side: ESPv2 automatically transcodes gRPC stream → SSE
- Frontend: Angular uses EventSource to receive events
- Progress Updates: Stream thinking, tool calls, and responses as separate events
Example SSE Stream:
event: message
data: {"type":"THINKING","thinkingContent":"I need to search..."}
event: message
data: {"type":"TOOL_CALL_START","toolCall":{"toolName":"GetPlan"}}
event: message
data: {"type":"TEXT","content":"Based on my search..."}
event: message
data: {"type":"TEXT","content":" here is the result","isFinal":true}
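On the wire, SSE events are separated by a blank line, and a network read can end in the middle of an event. The browser EventSource API handles this itself but only issues GET requests, so a client that POSTs the user message and reads the response body directly would need its own parser. The sketch below shows that parsing step; the `StreamChunk` shape is inferred from the example stream above, and the class name is an assumption.

```typescript
// Hypothetical chunk shape, inferred from the example stream.
interface StreamChunk {
  type: "THINKING" | "TOOL_CALL_START" | "TEXT";
  content?: string;
  thinkingContent?: string;
  toolCall?: { toolName: string };
  isFinal?: boolean;
}

// Incremental parser: feed it raw text as it arrives. It returns every event
// completed so far and keeps any trailing partial event buffered.
class SseChunkParser {
  private buffer = "";

  push(text: string): StreamChunk[] {
    // Normalize CRLF after concatenating, so a pair split across reads is handled.
    this.buffer = (this.buffer + text).replace(/\r\n/g, "\n");
    const chunks: StreamChunk[] = [];
    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const rawEvent = this.buffer.slice(0, boundary);
      this.buffer = this.buffer.slice(boundary + 2);
      // Join all data: lines of the event, dropping one optional leading space.
      const data = rawEvent
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, ""))
        .join("\n");
      if (data.length > 0) {
        // JSON.parse throws on malformed payloads; callers may want to catch that.
        chunks.push(JSON.parse(data) as StreamChunk);
      }
      boundary = this.buffer.indexOf("\n\n");
    }
    return chunks;
  }
}
```

The parser ignores the `event:` field because every event in the example uses `message`, with the chunk kind carried in the JSON `type` field.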
Why SSE Instead of WebSocket?
- Simpler implementation (plain HTTP, no protocol upgrade)
- Agent responses are one-way (server → client)
- User messages sent via separate POST requests
- Works better with ESPv2 transcoding
- Lower overhead for our use case
FR-3: API Endpoints
FR-3.1: Chat API (gRPC + REST via ESPv2)
gRPC Service: ChatService (defined in chat.proto)
REST Endpoints: Auto-generated by gRPC-Gateway annotations
POST /v1/chat/sessions/{sessionId}/stream
- Request: { "message": "user query", "context": {...} }
- Response: SSE stream (text/event-stream)
- event: message, data: {"type":"THINKING",...}
- event: message, data: {"type":"TOOL_CALL_START",...}
- event: message, data: {"type":"TEXT","content":"...",...}
- gRPC: rpc StreamChat() returns (stream ChatMessageChunk)
POST /v1/chat/sessions
- Request: { "projectId": "...", "metadata": {...} }
- Response: { "sessionId": "...", "createdAt": "..." }
- gRPC: rpc CreateSession(CreateSessionRequest) returns (SessionResponse)
GET /v1/chat/sessions/{sessionId}
- Response: { "messages": [...], "hasMore": false }
- gRPC: rpc GetSessionHistory(GetSessionHistoryRequest) returns (SessionHistoryResponse)
DELETE /v1/chat/sessions/{sessionId}
- Response: Empty (204 No Content)
- gRPC: rpc DeleteSession(DeleteSessionRequest) returns (google.protobuf.Empty)
Note: ESPv2 automatically transcodes:
- gRPC server-side streaming → SSE (Server-Sent Events)
- Unary RPCs → Regular REST requests
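For the frontend, the four REST endpoints above could be captured in one small request-builder module. This is a sketch under assumptions: the names `chatRequests` and `ChatRequestSpec` are hypothetical, and the body fields are limited to those shown in the endpoint list.

```typescript
type ChatHttpMethod = "GET" | "POST" | "DELETE";

interface ChatRequestSpec {
  method: ChatHttpMethod;
  path: string;
  body: Record<string, unknown> | null;
}

// Session IDs are URL-encoded in case they contain reserved characters.
function sessionPath(sessionId: string): string {
  return `/v1/chat/sessions/${encodeURIComponent(sessionId)}`;
}

// Hypothetical builders mirroring the endpoint list; they describe requests
// without sending them, which keeps them easy to unit test.
const chatRequests = {
  createSession(projectId: string): ChatRequestSpec {
    return { method: "POST", path: "/v1/chat/sessions", body: { projectId } };
  },
  streamChat(sessionId: string, message: string, context: Record<string, unknown>): ChatRequestSpec {
    return { method: "POST", path: `${sessionPath(sessionId)}/stream`, body: { message, context } };
  },
  getSessionHistory(sessionId: string): ChatRequestSpec {
    return { method: "GET", path: sessionPath(sessionId), body: null };
  },
  deleteSession(sessionId: string): ChatRequestSpec {
    return { method: "DELETE", path: sessionPath(sessionId), body: null };
  },
};
```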
FR-3.2: Agent Management API
GET /v1/chat/agents
- Lists available agents (future: multiple specialized agents)
GET /v1/chat/agents/{agentId}/capabilities
- Returns list of tools/capabilities for agent
FR-4: Integration Points
FR-4.1: Context Awareness
- Current Project: Auto-include active project ID in agent context
- Current File: Include active file ID and name when viewing documents
- Current Page: Include page number when viewing plan pages
- Pinned Pages: Allow users to pin important pages for ongoing reference
- Context Visibility: Display all active context elements in chat header
- Context Management: Users can add/remove context items via chip actions
- User Permissions: Agent respects RBAC permissions
- Recent Activity: Include user's recent actions for better context
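The chip behavior described above (add, dismiss, pin, and follow the current page) can be modeled as pure functions over a list of context items. The sketch below is illustrative only; the `ContextItem` shape and function names are assumptions rather than existing types.

```typescript
type ContextKind = "project" | "file" | "page";

interface ContextItem {
  kind: ContextKind;
  id: string;
  label: string;
  pinned: boolean;
}

// Adds an item, replacing any existing chip with the same kind and id.
function addContextItem(items: readonly ContextItem[], item: ContextItem): ContextItem[] {
  return [...items.filter((i) => i.kind !== item.kind || i.id !== item.id), item];
}

// Removes the chip the user dismissed.
function removeContextItem(items: readonly ContextItem[], kind: ContextKind, id: string): ContextItem[] {
  return items.filter((i) => i.kind !== kind || i.id !== id);
}

// When the user navigates, the unpinned "current page" chip is replaced,
// while pinned pages stay in context.
function setCurrentPage(items: readonly ContextItem[], pageId: string, label: string): ContextItem[] {
  const kept = items.filter((i) => i.kind !== "page" || i.pinned);
  const alreadyPinned = kept.some((i) => i.kind === "page" && i.id === pageId);
  return alreadyPinned ? kept : [...kept, { kind: "page", id: pageId, label, pinned: false }];
}
```

Keeping these functions pure fits the RxJS state management planned for the frontend ChatService, since each user action maps to a new immutable list.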
FR-4.2: Deep Linking
- Navigate to Pages: [View Page 5](/projects/abc/pages/5)
- Open Compliance Reports: Link to specific compliance findings
- Search Results: Clickable search result links
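A sketch of how page deep links might be produced and consumed, assuming the `/projects/{projectId}/pages/{pageNumber}` route shown in the example link. The function names are hypothetical.

```typescript
// Builds the Markdown link the agent would emit for a plan page.
function pageLinkMarkdown(projectId: string, pageNumber: number): string {
  return `[View Page ${pageNumber}](/projects/${encodeURIComponent(projectId)}/pages/${pageNumber})`;
}

interface PageTarget {
  projectId: string;
  pageNumber: number;
}

// Parses an in-app href back into a navigation target. Returns null for
// anything that is not a page link, such as an external URL.
function parsePageHref(href: string): PageTarget | null {
  const match = /^\/projects\/([^/]+)\/pages\/(\d+)$/.exec(href);
  if (match === null) {
    return null;
  }
  const [, projectId, page] = match;
  if (projectId === undefined || page === undefined) {
    return null;
  }
  return { projectId: decodeURIComponent(projectId), pageNumber: Number(page) };
}
```

A link click handler could call `parsePageHref` and hand the result to the Angular router, falling back to normal browser navigation when it returns null.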
FR-4.3: Notifications
- Long-Running Tasks: Toast notification when background task completes
- Errors: User-friendly error messages in chat
- Suggestions: Proactive suggestions based on user's workflow
Non-Functional Requirements
NFR-1: Performance
- First Response: < 2 seconds for simple queries (no tool calls)
- Complex Queries: < 5 seconds for queries requiring multiple API calls
- Streaming Latency: < 100ms token-to-token latency
- Concurrent Users: Support 100 concurrent chat sessions
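These latency targets, and the P95 metric under Success Metrics, are easier to verify when the percentile definition is agreed up front. The sketch below uses the nearest-rank method, which is one reasonable choice among several; the function names are assumptions.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0 || p <= 0 || p > 100) {
    throw new Error("percentile needs samples and 0 < p <= 100");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  const value = sorted[rank - 1];
  if (value === undefined) {
    throw new Error("rank out of range");
  }
  return value;
}

// Illustrative check against a "P95 response time < 3 seconds" style target.
function meetsP95Target(latenciesMs: readonly number[], targetMs: number): boolean {
  return percentile(latenciesMs, 95) < targetMs;
}
```

With 20 samples, the P95 under this definition is the 19th smallest value, so a single slow outlier does not fail the target but two do.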
NFR-2: Reliability
- Uptime: 99.5% availability during business hours
- Error Rate: < 5% of queries result in errors
- Graceful Degradation: If agent fails, provide helpful error message
NFR-3: Security
- Authentication: All chat requests require Firebase authentication
- Authorization: Agent respects user's RBAC permissions
- Data Privacy: Chat logs comply with data retention policies
- Input Sanitization: Validate and sanitize user input to mitigate prompt injection attacks
- API Key Security: Gemini API keys secured in Secret Manager
NFR-4: Scalability
- Horizontal Scaling: Backend can scale to multiple instances
- Session State: Store session state in Redis/Firestore
- Stateless Agent: Agent instances are stateless for easy scaling
NFR-5: Usability
- Mobile Responsive: Chat works on mobile screens (320px+)
- Keyboard Navigation: Full keyboard support (Enter to send, etc.)
- Screen Reader: ARIA labels for accessibility
- Dark Mode: Respects user's theme preference
NFR-6: Observability
- Query Logging: Log all user queries (sanitized)
- Tool Usage Metrics: Track which tools are most used
- Error Tracking: Detailed error logs in Cloud Logging
- Performance Tracing: OpenTelemetry tracing for latency analysis
Technical Architecture
Deployment Architecture
✅ Pure gRPC Architecture (No Spring Boot, No Separate Backend)
This design integrates chat directly into your existing gRPC service on Cloud Run.
Key Points:
- Chat functionality is integrated into existing gRPC Cloud Run service (not a separate backend)
- New ChatService gRPC service added alongside existing services (ArchitecturalPlanService, etc.)
- Agent runs in-process within the same Cloud Run instance
- Agent uses OpenApiToolset to call other services via loopback through ESPv2
- ESPv2 proxy handles REST ↔ gRPC transcoding for all APIs (existing + chat)
- Zero new infrastructure - reuses your existing gRPC + ESPv2 + Cloud Run stack
What This Means:
- No separate deployment for chat
- No Spring Boot framework needed
- No additional Cloud Run service costs
- Agent and services run in the same JVM process
- Same deployment process: ./cli/sdlc/full-stack-deploy.sh
High-Level Architecture
┌──────────────────────────────────────────────────────────────┐
│                       Angular Frontend                       │
│  ┌────────────────┐  ┌──────────────┐  ┌─────────────────┐   │
│  │  Chat Component│  │  Chat Service│  │  SSE Client     │   │
│  │  (Material 3)  │←→│  (TypeScript)│←→│  (EventSource)  │   │
│  └────────────────┘  └──────────────┘  └─────────────────┘   │
└──────────────────────────────────────────────┬───────────────┘
                                               │ HTTPS/SSE
                                               ▼
          ┌──────────────────────────────────────────┐
          │         ESPv2 Proxy (Cloud Run)          │
          │  - REST ↔ gRPC transcoding               │
          │  - Firebase auth validation              │
          │  - Serves /v1/chat/* endpoints           │
          └──────────────┬───────────────────────────┘
                         │ gRPC (Internal)
                         ▼
┌──────────────────────────────────────────────────────────────┐
│               gRPC Services (Cloud Run - Java)               │
│                                                              │
│   ┌──────────────────────────────────────────────────────┐   │
│   │  NEW: ChatService (gRPC)                             │   │
│   │  - StreamChat(request) → stream ChatMessageChunk     │   │
│   │  - CreateSession()                                   │   │
│   │  - GetSessionHistory()                               │   │
│   └────────────────┬─────────────────────────────────────┘   │
│                    │                                         │
│                    ▼                                         │
│   ┌──────────────────────────────────────────────────────┐   │
│   │  ChatAgentService (Java)                             │   │
│   │  - InMemoryRunner with LlmAgent                      │   │
│   │  - Event streaming                                   │   │
│   │  - Context injection                                 │   │
│   └────────────────┬─────────────────────────────────────┘   │
│                    │                                         │
│                    ▼                                         │
│   ┌──────────────────────────────────────────────────────┐   │
│   │  LlmAgent (ADK)                                      │   │
│   │  - Model: Gemini 2.5 Flash w/ thinking               │   │
│   │  - Tools: OpenApiToolset                             │   │
│   │  - System prompt with context                        │   │
│   └────────────────┬─────────────────────────────────────┘   │
│                    │                                         │
│                    ▼                                         │
│   ┌──────────────────────────────────────────────────────┐   │
│   │  OpenApiToolset                                      │   │
│   │  - Loads openapi.yaml                                │   │
│   │  - Creates REST tools                                │   │
│   │  - Calls ESPv2 endpoints                             │   │
│   └────────────────┬─────────────────────────────────────┘   │
│                    │                                         │
│   ┌────────────────▼─────────────────────────────────────┐   │
│   │  Existing gRPC Services:                             │   │
│   │  - ArchitecturalPlanService                          │   │
│   │  - ArchitecturalPlanReviewService                    │   │
│   │  - ComplianceCodeSearchService                       │   │
│   └──────────────────────────────────────────────────────┘   │
│                                                              │
│   ┌──────────────────────────────────────────────────────┐   │
│   │  Firestore (via Admin SDK)                           │   │
│   │  - Chat sessions                                     │   │
│   │  - Message history                                   │   │
│   └──────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────┘
Agent calls its own APIs via loopback through ESPv2:
LlmAgent → OpenApiToolset → http://localhost:8080 or ESPv2 URL
→ gRPC Services (same process)
Component Responsibilities
Frontend Components
ChatComponent (web-ng-m3/src/app/components/chat/)
- Material 3 UI components (FAB, panel, messages)
- Message rendering (Markdown, links, images)
- User input handling
- SSE connection management
ChatService (web-ng-m3/src/app/services/chat.service.ts)
- API client for chat endpoints
- Session management
- Message queue/buffering
- State management (RxJS)
ChatMessageComponent (web-ng-m3/src/app/components/chat/message/)
- Individual message bubble
- Markdown rendering
- Link click handling
- Copy/share actions
Backend Components (gRPC Services)
ChatService (src/main/proto/chat.proto + ChatServiceImpl.java)
- gRPC service definition with server-side streaming
- REST endpoints via gRPC-Gateway annotations
- Session creation/management RPCs
- Message streaming RPC
ChatAgentService (src/main/java/.../service/ChatAgentService.java)
- ADK agent initialization (singleton per Cloud Run instance)
- OpenApiToolset loading from openapi.yaml
- Context injection (project, file, page, pinned pages)
- Tool execution coordination
- Event streaming (thinking, tool calls, responses)
ChatSessionService (src/main/java/.../service/ChatSessionService.java)
- Session persistence (Firestore via Admin SDK)
- Conversation history management
- Session expiration/cleanup
- User authorization validation
OpenApiToolset (src/main/java/com/google/adk/tools/openapi/)
- Already exists in your codebase!
- Wraps each OpenAPI operation as an ADK tool
- Makes HTTP calls to gRPC-Gateway or ESPv2
- Handles request/response transformation
Data Models - Three Layers Explained
Important: There are THREE separate data model layers in this design:
| Layer | Models | Purpose | Location |
|---|---|---|---|
| 1. ADK Internal | Session, Event | Agent conversation state | com.google.adk.* (already exists) |
| 2. gRPC Proto | ChatMessageChunk, ChatContext | API wire protocol | chat.proto (new) |
| 3. TypeScript | ChatSession, ChatMessage | Frontend types | chat.models.ts (new) |
Layer 1 (ADK) is reused - We don't redefine Session/Event
Layers 2 & 3 are new - They define our chat-specific API and UI models
Relationship Between Layers
Angular (TypeScript models)
↓ HTTP/SSE
gRPC Proto (ChatMessageChunk)
↓ ChatServiceImpl converts
ADK Internal (Session, Event) ← Managed by InMemoryRunner
Why Three Layers?
- ADK models = Internal agent state (we don't control these)
- Proto models = gRPC API contract (we define these)
- TypeScript models = Frontend types (mirror proto models)
Summary: What We Reuse vs What We Create
✅ REUSE (ADK Already Provides):
- com.google.adk.sessions.Session - Internal session management
- com.google.adk.events.Event - Internal event structure
- com.google.adk.runner.InMemoryRunner - Orchestrates agent execution
- Session/Event management is handled automatically by ADK
🆕 CREATE (New Code):
- chat.proto - gRPC service definition with our chat-specific messages
- ChatServiceImpl.java - gRPC service implementation
- ChatAgentService.java - Wraps InMemoryRunner, converts ADK Events to our models
- ChatSession (Firestore) - UI metadata (projectId, pinnedPages, etc.)
- ChatMessage (Firestore) - Simplified message history for display
- TypeScript models - Frontend types (mirror the proto messages)
Important: The TypeScript models are NOT duplicating ADK classes - they're the frontend API contract!
Chat Session (Firestore Metadata)
This stores additional metadata beyond ADK Session:
/**
 * Chat session metadata (NOT the same as ADK Session)
 *
 * ADK Session is managed internally by InMemoryRunner
 * ChatSession adds PermitProof-specific fields
 */
interface ChatSession {
  sessionId: string; // Maps to ADK Session.id
  userId: string; // Maps to ADK Session.userId
  projectId?: string; // NEW: PermitProof context
  createdAt: Timestamp; // NEW: UI tracking
  lastMessageAt: Timestamp; // NEW: UI tracking
  messageCount: number; // NEW: UI counter
  metadata: {
    agentName: string; // Maps to ADK Session.appName
    modelName: string; // NEW: Gemini model version
    userAgent: string;
  };
}
Chat Message
/**
 * Simplified chat message (NOT the same as ADK Event)
 *
 * ADK Event has complex structure with Parts, FunctionCalls, etc.
 * ChatMessage extracts the essentials for UI display
 */
interface ChatMessage {
  messageId: string; // Maps to ADK Event.id
  sessionId: string;
  role: 'user' | 'agent' | 'system'; // Derived from Event.author
  content: string; // Extracted from Event.content() Parts
  timestamp: Timestamp; // Derived from Event.timestamp()
  metadata?: {
    thinkingContent?: string; // NEW: From thinking tokens
    toolCalls?: ToolCall[]; // NEW: From FunctionCalls
    tokens?: TokenUsage; // NEW: From Gemini response
    latencyMs?: number; // NEW: Measured by us
  };
}
Tool Call
interface ToolCall {
  toolName: string;
  operationId: string;
  parameters: Record<string, any>;
  result?: any;
  error?: string;
  durationMs: number;
}
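On the frontend, streamed chunks have to be folded into a single message for display. The sketch below shows one way to do that with a reducer. The `StreamChunk` and `AssembledMessage` shapes are simplified assumptions based on the example SSE stream, not the final proto or the `ChatMessage` model above.

```typescript
// Simplified, hypothetical chunk shape based on the example SSE stream.
interface StreamChunk {
  type: "THINKING" | "TOOL_CALL_START" | "TEXT";
  content?: string;
  thinkingContent?: string;
  toolCall?: { toolName: string };
  isFinal?: boolean;
}

// Hypothetical in-progress view of one agent message.
interface AssembledMessage {
  content: string;
  thinkingContent: string;
  toolNames: string[];
  isComplete: boolean;
}

const emptyMessage: AssembledMessage = {
  content: "",
  thinkingContent: "",
  toolNames: [],
  isComplete: false,
};

// Pure reducer: returns a new message with the chunk applied.
function applyChunk(message: AssembledMessage, chunk: StreamChunk): AssembledMessage {
  switch (chunk.type) {
    case "THINKING":
      return { ...message, thinkingContent: message.thinkingContent + (chunk.thinkingContent ?? "") };
    case "TOOL_CALL_START":
      return chunk.toolCall === undefined
        ? message
        : { ...message, toolNames: [...message.toolNames, chunk.toolCall.toolName] };
    case "TEXT":
      return {
        ...message,
        content: message.content + (chunk.content ?? ""),
        isComplete: chunk.isFinal === true,
      };
  }
}
```

Applied to the four events of the example stream, the reducer yields the text "Based on my search... here is the result", one tool name, and a completed message.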
API Tools Available to Agent
The agent will have access to all operations defined in openapi.yaml (auto-generated from proto files with gRPC-Gateway).
Important: Tools are loaded at agent initialization from the OpenAPI spec, meaning:
- New proto service methods → regenerate openapi.yaml → restart Cloud Run
- Agent automatically gets new tools without code changes to ChatAgentService
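To illustrate why new operations appear as tools without code changes, the sketch below derives tool names from the operationId of every operation in an OpenAPI document. It is a simplified illustration in TypeScript, not the ADK OpenApiToolset implementation; the types cover only the fields used here, and the paths in the usage example are hypothetical.

```typescript
const HTTP_METHODS = ["get", "post", "put", "patch", "delete"] as const;
type HttpMethodKey = (typeof HTTP_METHODS)[number];

// Minimal subset of the OpenAPI structure needed for this sketch.
interface OpenApiOperation {
  operationId?: string;
}

type OpenApiPathItem = Partial<Record<HttpMethodKey, OpenApiOperation>>;

interface OpenApiDocument {
  paths: Record<string, OpenApiPathItem>;
}

// Collects the operationId of every operation, sorted for stable output.
function listOperationIds(doc: OpenApiDocument): string[] {
  const ids: string[] = [];
  for (const pathItem of Object.values(doc.paths)) {
    for (const method of HTTP_METHODS) {
      const operationId = pathItem[method]?.operationId;
      if (operationId !== undefined) {
        ids.push(operationId);
      }
    }
  }
  return ids.sort();
}
```

For example, with hypothetical paths:

```typescript
const exampleDoc: OpenApiDocument = {
  paths: {
    "/v1/example/plans": { get: { operationId: "ArchitecturalPlanService_ListArchitecturalPlanIds" } },
    "/v1/example/plans/{id}": { get: { operationId: "ArchitecturalPlanService_GetArchitecturalPlan" } },
  },
};
const toolNames = listOperationIds(exampleDoc);
```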
Architectural Plan APIs
- ArchitecturalPlanService_ListArchitecturalPlanIds - List all plans
- ArchitecturalPlanService_GetArchitecturalPlan - Get plan details
- ArchitecturalPlanService_GetArchitecturalPlanPagePdf - Get page PDF
- ArchitecturalPlanService_GetArchitecturalPlanPageMarkdown - Get page as markdown
- ArchitecturalPlanService_GetArchitecturalPlanPageTranscript - Get OCR transcript
- ArchitecturalPlanService_GetArchitecturalPlanPageExplanation - Get AI explanation
Plan Review APIs
- ArchitecturalPlanReviewService_GetApplicableCodeSections - Get applicable ICC codes
- ArchitecturalPlanReviewService_GetPageComplianceReport - Get compliance report
- ArchitecturalPlanReviewService_GetPageInspectionChecklist - Get inspection checklist
Code Search APIs
- ComplianceCodeSearchService_GetIccCodeSearchResults - Search ICC codes
- ComplianceCodeSearchService_GetIccBookChapter - Get code chapter
Agent System Prompt
You are PermitProof Assistant, an AI expert in building code compliance and
architectural plan review. You help users navigate construction permits,
analyze architectural plans, and understand building code requirements.
You have access to the following capabilities:
1. Retrieve and analyze architectural plans (PDFs, transcripts, explanations)
2. Identify applicable building code sections
3. Generate compliance reports
4. Search ICC building codes
5. Answer questions about building regulations
When a user asks a question:
1. Determine which tools are needed
2. Call the appropriate APIs with correct parameters
3. Synthesize results into a clear, professional response
4. Cite specific page numbers, code sections, and sources
5. Provide actionable recommendations when applicable
Context:
- Current Project: {{PROJECT_ID}}
- Current Page: {{PAGE_NUMBER}}
- User Role: {{USER_ROLE}}
Be concise, accurate, and helpful. If you're unsure, say so and suggest
alternative approaches.
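Filling the `{{PROJECT_ID}}`-style placeholders in the prompt could look like the sketch below. It is written in TypeScript for consistency with the other snippets, although the actual substitution would happen in the Java ChatAgentService; the function name is an assumption.

```typescript
// Replaces {{NAME}} placeholders with values and fails loudly when a value
// is missing, so a half-filled prompt is never sent to the model.
function renderPrompt(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (placeholder: string, key: string) => {
    const value = values[key];
    if (value === undefined) {
      throw new Error(`Missing value for ${placeholder}`);
    }
    return value;
  });
}
```

Because context values end up inside the system prompt, anything derived from user-controlled data would need the same scrutiny as the input sanitization requirement in NFR-3.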
User Interface Design
Desktop Layout
┌─────────────────────────────────────────────────────────┐
│ PermitProof - Project ABC                               │
├─────────────────────────────────────┬───────────────────┤
│                                     │                   │
│                                     │  ╭───────────────╮│
│  Main Content Area                  │  │ 🤖 Chat       ││
│  (Plan Viewer / Dashboard)          │  ├───────────────┤│
│                                     │  │ 📌 Context:   ││
│                                     │  │ [ABC ×] [P.5×]││
│                                     │  │ [E-2.1 ×] +   ││
│                                     │  ├───────────────┤│
│                                     │  │ 👤 User:      ││
│                                     │  │ Where is the  ││
│                                     │  │ electrical... ││
│                                     │  ├───────────────┤│
│                                     │  │ 🤖 Agent:     ││
│                                     │  │ ▼ 💭 Thinking ││
│                                     │  │ ▼ 🔧 Tool (2) ││
│                                     │  │ Based on my   ││
│                                     │  │ search...     ││
│                                     │  │ [Page 8] 📄   ││
│                                     │  ├───────────────┤│
│                                     │  │ 👤 User:      ││
│                                     │  │ Does it meet..││
│                                     │  ├───────────────┤│
│                                     │  │ 🤖 ⚙️ Working...││
│                                     │  ╰───────────────╯│
│                                     │  ┌───────────────┐│
│                                     │  │ Ask question..││
│                                     │  └───────────────┘│
└─────────────────────────────────────┴───────────────────┘
Legend:
▼ = Collapsible section (click to expand/collapse)
[Item ×] = Context chip with remove action
+ = Add more context
Mobile Layout
┌───────────────────┐
│ PermitProof       │
├───────────────────┤
│                   │
│   Main Content    │
│                   │
│                   │
│                   │
│                   │
│        [+] ← FAB  │
└───────────────────┘
(Tap FAB to open chat panel from bottom)
┌───────────────────┐
│ PermitProof       │
├───────────────────┤
│ User: Show page 5 │
├───────────────────┤
│ Agent: Here is... │
│ [Page 5 Link]     │
├───────────────────┤
│ 💭 Thinking...    │
├───────────────────┤
│ Ask a question... │
│               [>] │
└───────────────────┘
Example Conversations
Example 1: Simple Information Retrieval with Context
📌 Context: [Project: Oakmont ×] [File: Plans.pdf ×] [+]
User: Where is the electrical plan for the kitchen in unit 2B?
Agent: [💭 Thinking... (click to expand)]
└─ The user is asking about unit 2B's kitchen electrical. I can see
we're already viewing the Oakmont project plans, so I'll search
within this project's pages for electrical layouts.
[🔧 Calling API: ArchitecturalPlanService_ListArchitecturalPlanIds]
⚙️ Searching Oakmont project pages...
✓ Found 24 pages (120ms)
[🔧 Calling API: ArchitecturalPlanService_GetArchitecturalPlanPageTranscript]
⚙️ Analyzing page 8: "Second Floor - Electrical Plan"
✓ Completed (450ms)
Based on my search of the **Oakmont project** plans, the electrical plan for
unit 2B's kitchen is on:
📄 **Page 8** - "Second Floor Electrical Plan"
• Sheet E-2.1
• Shows kitchen outlets, lighting, and appliance circuits
• [View Page 8] [📌 Pin Page]
The kitchen has:
- (4) Counter outlets on dedicated 20A circuits
- Recessed lighting on dimmer switch
- Range outlet (240V/40A)
- Dishwasher and disposal on separate circuits
Would you like me to check if this meets code requirements?
Example 2: Multi-Step Query with Progressive Updates and Pinned Pages
📌 Context: [Project: Oakmont ×] [📌 Page 5: Floor Plan ×] [📌 Page 12: Elevations ×] [+]
User: Show me all fire egress violations in this project
Agent: [💭 Thinking... (click to expand)]
└─ I need to analyze all pages of the Oakmont project for fire egress
compliance issues. I can see the user has pinned pages 5 and 12, so
I'll pay special attention to those. I'll check each page's compliance
report and filter for IBC Chapter 10 (Means of Egress) violations.
[🔧 Processing 24 pages...]
✓ Page 1-6: No fire egress issues
⚙️ Page 7: Analyzing...
⚠️ Page 8: Found corridor width issue
✓ Page 9-11: No fire egress issues
⚠️ Page 12: Found exit signage issue
✓ Page 13-24: No fire egress issues
[📊 Results: 2 violations found]
┌─ [🔧 Page 8 Tool Call Details] (click to expand)
│ Tool: ArchitecturalPlanReviewService_GetPageComplianceReport
│ Parameters: { projectId: "proj_oak_001", pageNumber: 8 }
│ Duration: 1,240ms
│ Status: ✓ Success
└─
Found **2 fire egress violations** requiring attention:
📄 **Page 8** - Second Floor Plan
⚠️ **Major Violation**: Exit corridor width 42" (requires 44" per IBC 1020.2)
• Location: North corridor between Units 2A and 2B
• Required: 44 inches minimum
• Provided: 42 inches
• [View Page 8] [View Code Section]
📄 **Page 12** - Stairwell Detail
⚠️ **Minor Violation**: Missing exit signage locations
• Code: IBC 1013.1 (Exit Signs)
• Required: Every exit and exit access door
• Issue: No signage shown at north stairwell door
• [View Page 12] [View Code Section]
**Recommended Actions:**
1. Widen north corridor to 44" or request variance
2. Add exit signage symbols to plan at stairwell doors
💡 **Note**: I also reviewed your pinned pages: Page 5 shows no egress issues,
and Page 12 contains the exit signage issue listed above.
Would you like me to generate a formal compliance report PDF?
[📌 Pin Page 8] - Add to context for follow-up questions
Example 3: Code Research with Thinking
User: What does IBC Section 1020.2 say about corridor widths?
Agent: [💭 Thinking... (click to expand)]
└─ The user is asking about a specific IBC code section (1020.2)
regarding corridor widths. I should search the ICC building code
database and provide the exact text along with context about
exceptions and measurement requirements.
[🔧 Calling API: ComplianceCodeSearchService_GetIccCodeSearchResults]
⚙️ Searching IBC 2021 Book (ID: 2217)
⚙️ Query: "Section 1020.2 corridor width"
✓ Found section (680ms)
**IBC Section 1020.2** - Exit Access Corridor Width
**Minimum Width**: Exit access corridors shall be not less than **44 inches
(1118 mm)** in width.
**Exceptions**:
1. Corridors serving an occupant load of less than 50 shall be not less
than **36 inches (914 mm)** in width
2. Corridors in Group I-2 occupancies shall comply with Section 407.3
3. Corridors in Group I-3 occupancies shall comply with Section 408.7
**Measurement**: Width shall be measured in the **clear** at the narrowest point,
meaning no obstructions (doors when open, handrails, etc.) can reduce the width
below the minimum.
**Related Sections**:
- IBC 1005: Egress width calculation
- IBC 1018.2: Corridor continuity
- IBC 1020.1: Corridor construction requirements
📖 **Source**: IBC 2021, Chapter 10 (Means of Egress)
[View Full Section in Code Book]
💡 **Common Issues**:
- Door swing reducing effective width
- Built-in cabinets or fixtures encroaching
- Recessed door frames not counted in clear width
Is there a specific corridor in your project you'd like me to analyze for
compliance with this section?
Implementation Phases
Phase 1: MVP (4 weeks)
Goal: Basic chat with read-only operations
Deliverables:
- Angular chat component (Material 3)
- ChatService gRPC implementation (chat.proto + ChatServiceImpl.java)
- ADK agent with OpenApiToolset
- Support for 5 core tools:
  - List plans
  - Get plan details
  - Get page PDF
  - Get applicable codes
  - Search ICC codes
- SSE streaming responses
- Basic session management (in-memory)
Success Criteria:
- Users can ask questions and get responses
- Agent successfully calls APIs
- Response time < 3 seconds
- Works on desktop and mobile
Phase 2: Enhanced UX (3 weeks)
Goal: Production-ready interface with persistence
Deliverables:
- Firestore session persistence
- All OpenAPI tools enabled (20+ operations)
- Markdown rendering with syntax highlighting
- Deep linking to pages/sections
- Export chat history
- Dark mode support
- Typing indicators and progress states
- Error retry logic
Success Criteria:
- Session persistence across browser refreshes
- All API tools working
- Rich message formatting
- Comprehensive error handling
Phase 3: Intelligence (3 weeks)
Goal: Smart context and proactive features
Deliverables:
- Project context injection
- Page context injection
- Recent activity awareness
- Suggested queries based on user's workflow
- Tool usage analytics
- Query rewriting for better results
- Multi-turn conversation memory
Success Criteria:
- Agent uses context effectively
- Suggestions are relevant at least 70% of the time
- Improved task completion rate
Phase 4: Polish (2 weeks)
Goal: Production hardening
Deliverables:
- Comprehensive error handling
- Rate limiting
- Performance optimization
- Security audit
- Accessibility audit (WCAG 2.1 AA)
- User documentation
- Admin dashboard for monitoring
Success Criteria:
- Passes security review
- WCAG 2.1 AA compliant
- Performance meets NFRs
- Ready for production deployment
Risks and Mitigations
Risk 1: LLM Hallucinations
Impact: High - Users receive incorrect information
Likelihood: Medium
Mitigation:
- Always cite API sources in responses
- Include disclaimer about AI-generated content
- Implement confidence scoring
- Allow users to report incorrect responses
Risk 2: API Rate Limits (Gemini)
Impact: High - Service degradation
Likelihood: Medium
Mitigation:
- Implement exponential backoff
- Queue requests during high load
- Cache common queries
- Monitor quota usage with alerts
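The exponential backoff mitigation could look roughly like the sketch below. The helper names, the 500 ms base delay, the 30 s cap, and the five-attempt limit are illustrative assumptions, not values taken from this PRD or from the Gemini API. For brevity the sketch retries on any thrown error; a real client would retry only on rate-limit or transient failures.

```typescript
// Delay before retry `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Retries `call` when it throws, sleeping with full jitter between attempts.
// `sleep` is injectable so the helper can be exercised without real waiting.
async function withBackoff<T>(
  call: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        // Full jitter: wait a random time in [0, capped exponential delay).
        await sleep(Math.random() * backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

Jitter spreads retries from many clients over time, which avoids synchronized bursts against an already rate-limited API.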
Risk 3: Cost (Gemini API)
Impact: Medium - Unexpected bills
Likelihood: High
Mitigation:
- Set monthly budget alerts
- Implement per-user rate limits
- Use Flash model for simple queries
- Cache responses for common questions
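One possible way to implement the per-user rate limits is a token bucket keyed by user ID, sketched below. The class name `PerUserRateLimiter` and the capacity and refill values in the usage note are assumptions for illustration. The sketch also keeps state in memory, so a multi-instance deployment would need a shared store instead.

```typescript
// Minimal in-memory token bucket per user; limits are illustrative only.
class PerUserRateLimiter {
  private readonly capacity: number;        // maximum burst per user
  private readonly refillPerSecond: number; // sustained requests per second
  private readonly buckets = new Map<string, { tokens: number; updatedMs: number }>();

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Returns true and consumes one token if the user may send a request now.
  tryAcquire(userId: string, nowMs: number = Date.now()): boolean {
    const bucket = this.buckets.get(userId) ?? { tokens: this.capacity, updatedMs: nowMs };
    const elapsedSec = Math.max(0, nowMs - bucket.updatedMs) / 1000;
    // Refill in proportion to elapsed time, never above capacity.
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSec * this.refillPerSecond);
    bucket.updatedMs = nowMs;
    this.buckets.set(userId, bucket);
    if (bucket.tokens < 1) return false;
    bucket.tokens -= 1;
    return true;
  }
}
```

For example, `new PerUserRateLimiter(10, 0.2)` would allow a burst of 10 chat messages per user and then one message every five seconds.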
Risk 4: Privacy/Compliance
Impact: High - Data leak or compliance violation
Likelihood: Low
Mitigation:
- Sanitize logs (remove PII)
- Implement data retention policies
- Get legal review of chat logs
- Allow users to delete chat history
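As a small illustration of log sanitization, the sketch below redacts email addresses and US-style phone numbers before a chat message is logged. The function name `sanitizeForLog` and both patterns are assumptions chosen for the example. They cover only two kinds of PII and would need a broader, reviewed rule set in practice.

```typescript
// Illustrative patterns only; they do not cover all forms of PII.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const US_PHONE_PATTERN = /\b\d{3}[-. ]\d{3}[-. ]\d{4}\b/g;

// Replaces matched PII with fixed placeholders so logs stay readable.
function sanitizeForLog(text: string): string {
  return text
    .replace(EMAIL_PATTERN, "[EMAIL]")
    .replace(US_PHONE_PATTERN, "[PHONE]");
}
```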
Risk 5: Poor Tool Selection
Impact: Medium - Incorrect API calls
Likelihood: Medium
Mitigation:
- Comprehensive tool descriptions in OpenAPI
- Include examples in tool descriptions
- Implement tool call validation
- Monitor and tune agent prompts
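Tool call validation could be as simple as checking the agent's proposed arguments against a parameter specification before the API is invoked. The sketch below assumes a hand-written, simplified spec (`ParamSpec`, `ToolSpec`, and `validateToolCall` are hypothetical names) and borrows the parameter names from the tool description example in Appendix B. A real implementation would more likely derive the spec from the OpenAPI document.

```typescript
// Hypothetical, simplified parameter spec for one tool.
type ParamSpec = { type: "string" | "integer"; required: boolean; minimum?: number };
type ToolSpec = Record<string, ParamSpec>;

// Returns a list of human-readable problems; an empty list means the call is valid.
function validateToolCall(spec: ToolSpec, args: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const [name, param] of Object.entries(spec)) {
    const value = args[name];
    if (value === undefined || value === null) {
      if (param.required) errors.push(`missing required parameter: ${name}`);
      continue;
    }
    if (param.type === "string") {
      if (typeof value !== "string" || value.length === 0) {
        errors.push(`${name} must be a non-empty string`);
      }
    } else if (typeof value !== "number" || !Number.isInteger(value)) {
      errors.push(`${name} must be an integer`);
    } else if (param.minimum !== undefined && value < param.minimum) {
      errors.push(`${name} must be >= ${param.minimum}`);
    }
  }
  // Reject arguments the tool does not declare, a common sign of a bad tool choice.
  for (const name of Object.keys(args)) {
    if (!(name in spec)) errors.push(`unknown parameter: ${name}`);
  }
  return errors;
}

// Spec for the page PDF tool, using the parameters named in Appendix B.
const getPagePdfSpec: ToolSpec = {
  architecturalPlanId: { type: "string", required: true },
  pageNumber: { type: "integer", required: true, minimum: 1 },
};
```

Returning all problems at once, rather than failing on the first, gives the agent enough detail to correct the call in a single retry.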
Risk 6: User Confusion
Impact: Medium - Low adoption
Likelihood: Medium
Mitigation:
- Onboarding tutorial
- Example queries visible in UI
- Clear limitations messaging
- Escape hatch to traditional UI
Open Questions
- Model Selection: Use Gemini Flash (fast, cheap) or Pro (smart, expensive)?
  - Recommendation: Start with Flash, upgrade to Pro for complex queries
- Session Scope: Should chat be project-scoped or global?
  - Recommendation: Project-scoped by default, with option for global queries
- Tool Visibility: Show users which APIs were called?
  - Recommendation: Yes, in expandable "debug" section
- Multi-Modal: Support image uploads (plan markups)?
  - Recommendation: Phase 2+ feature
- Permissions: How to handle restricted data?
  - Recommendation: Agent respects RBAC, returns "unauthorized" for restricted tools
- Analytics: Track all queries or sample?
  - Recommendation: Log all queries (sanitized), sample detailed telemetry
Dependencies
Internal Dependencies
- gRPC services running and healthy
- gRPC-Gateway proxy operational
- OpenAPI spec up-to-date with all operations
- Firebase Authentication integrated
- RBAC system functional
External Dependencies
- Google Gemini API access
- ADK Java libraries (latest version)
- Angular Material 3
- RxJS for reactive state
- Markdown rendering library (e.g., marked.js)
Appendix
A: Agent Instruction Template
See section "Agent System Prompt" above.
B: Tool Description Guidelines
Each tool in OpenAPI should have:
- Clear summary (1 sentence description)
- Detailed description with:
  - What the tool does
  - When to use it
  - Example use cases
  - Important parameters
  - Expected response structure
Example:
/v1/architectural-plans/{architecturalPlanId}/pages/{pageNumber}/pdf:
  post:
    operationId: ArchitecturalPlanService_GetArchitecturalPlanPagePdf
    summary: Retrieves the PDF document for a specific page
    description: |
      Gets the PDF file for a single page of an architectural plan.
      Use this when:
      - User wants to view the actual plan page
      - Need to display visual content
      - Preparing page for download
      Parameters:
      - architecturalPlanId: The plan's unique ID (e.g., "proj_oak_2024_001")
      - pageNumber: Page number (1-indexed)
      Returns:
      - Binary PDF data (base64 encoded)
      - Page metadata (dimensions, file size)
      Performance: < 500ms for standard pages
C: Example Query Patterns
Document Navigation
- "Where is the electrical riser diagram?"
- "Show me all pages showing the kitchen layout"
- "Find the foundation details for the south wall"
- "Which sheet shows the HVAC layout for the second floor?"
Compliance Analysis
- "Are there any fire egress violations in the second floor corridor?"
- "Does the parking lot design meet ADA requirements?"
- "Check if the stairwell meets IBC width requirements"
- "What code sections apply to the mechanical room?"
Code Research
- "What does IBC 1020.2 say about corridor widths?"
- "Explain the occupancy load calculation for assembly spaces"
- "What are the fire rating requirements for 2-hour walls?"
- "Search for 'guardrail height' requirements in IBC 2021"
Technical Extraction
- "What is the building height shown on the elevations?"
- "How many parking spaces are provided on the site plan?"
- "What's the occupancy classification for this building?"
- "List all the room dimensions for unit 2B"
Cross-Discipline Coordination
- "Do the structural columns match on architectural and structural sheets?"
- "Compare plumbing fixture locations between arch and mech plans"
- "Are all doors on floor plans shown in the door schedule?"
- "Check if electrical outlets match between plans and specs"
D: Error Message Patterns
API Errors
Agent: I tried to retrieve that page, but encountered an error.
The plan service is temporarily unavailable. Please try again in a moment.
[Technical Details] (collapsible)
Error: 503 Service Unavailable
Endpoint: GET /v1/architectural-plans/proj_123/pages/5/pdf
Permission Errors
Agent: I don't have permission to access that information.
This might be because:
- The project is restricted to certain roles
- You don't have access to this project
- The page contains sensitive information
Please contact your project administrator if you think this is an error.
Not Found Errors
Agent: I couldn't find a project called "Oakmont Plaza".
Did you mean one of these?
- Oakmont Residential (proj_oak_2024_001)
- Oak Plaza Commercial (proj_oakplz_2024_003)
Or try: "List all my projects"
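The "Did you mean" suggestions in the not-found pattern could be produced by a simple similarity ranking over the user's accessible projects. The sketch below ranks by shared words, which is one assumed approach among several (edit distance or a search index would also work). The names `Project`, `tokenize`, and `suggestProjects` are hypothetical.

```typescript
interface Project {
  id: string;
  name: string;
}

// Lowercases and splits a name into alphanumeric words.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Ranks projects by how many distinct words they share with the query.
function suggestProjects(query: string, projects: Project[], limit = 3): Project[] {
  const queryTokens = tokenize(query);
  return projects
    .map((project) => {
      const nameTokens = Array.from(new Set(tokenize(project.name)));
      const shared = nameTokens.filter((t) => queryTokens.includes(t)).length;
      return { project, shared };
    })
    .filter((scored) => scored.shared > 0)
    .sort((a, b) => b.shared - a.shared) // stable sort keeps input order on ties
    .slice(0, limit)
    .map((scored) => scored.project);
}
```

With the query "Oakmont Plaza", both example projects above share one word with the query and would be suggested, while a project with no words in common would be left out.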
E: Testing Strategy
Unit Tests
- Tool selection logic
- Parameter extraction from user queries
- Error handling for API failures
- Session management
Integration Tests
- End-to-end chat flow
- SSE streaming
- Multi-tool workflows
- Permission enforcement
User Acceptance Tests
- Common query patterns
- Mobile responsiveness
- Accessibility (keyboard, screen reader)
- Dark mode
Load Tests
- 100 concurrent users
- Long-running conversations
- Large file handling
- API timeout scenarios
F: Monitoring and Alerts
Metrics to Track
- Query volume (queries/minute)
- Response latency (P50, P95, P99)
- Tool call success rate
- Error rate by error type
- Token usage (Gemini)
- User session duration
- Message count per session
Alerts
- Error rate > 10% (P1 - page immediately)
- P95 latency > 5s (P2 - investigate within 1 hour)
- Gemini API quota at 80% (P3 - plan for scaling)
- Session service unavailable (P1 - page immediately)
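The first three alert thresholds can be expressed as checks over a window of collected metrics, as in the sketch below. The `WindowStats` shape and the function names are assumptions for illustration, and the percentile uses the nearest-rank method, which is one of several common definitions. The session-service alert is a health check rather than a threshold, so it is not covered here. In a real deployment these rules would more likely live in the monitoring system's alert configuration than in application code.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Hypothetical summary of one monitoring window.
interface WindowStats {
  requests: number;
  errors: number;
  latenciesMs: number[];
  quotaUsedFraction: number; // 0.0 to 1.0 of the Gemini API quota
}

// Returns the alerts that fire for this window, using the thresholds listed above.
function evaluateAlerts(stats: WindowStats): string[] {
  const alerts: string[] = [];
  if (stats.requests > 0 && stats.errors / stats.requests > 0.1) {
    alerts.push("P1: error rate > 10%");
  }
  if (stats.latenciesMs.length > 0 && percentile(stats.latenciesMs, 95) > 5000) {
    alerts.push("P2: P95 latency > 5s");
  }
  if (stats.quotaUsedFraction >= 0.8) {
    alerts.push("P3: Gemini API quota at 80%");
  }
  return alerts;
}
```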
Dashboards
- Real-time query monitoring
- Tool usage heatmap
- Error breakdown by type
- Cost tracking (Gemini API)
- User adoption funnel
Document Version: 1.0
Last Updated: 2024-10-24
Owner: Engineering Team
Reviewers: Product, Design, Security