
PRD: Agentic Chat Feature for PermitProof

Overview

This PRD describes the integration of an AI-powered agentic chat interface into the PermitProof web application. The chat will provide users with an intelligent assistant that can answer questions, retrieve information, and perform actions across the PermitProof platform using natural language.

Background

Current State

  • PermitProof has a comprehensive gRPC API exposed as REST endpoints:
    • Local Development: gRPC-Gateway proxy on http://localhost:8082
    • Production: ESPv2 API Gateway on Cloud Run (e.g., https://construction-code-expert-esp2-prod-xxx.run.app)
  • An experimental ArchitecturalPlanReviewAgent demonstrates Agent Development Kit (ADK) integration with the OpenAPI toolset
  • ADK provides a reference chat UI implementation in its development environment
  • Users currently interact with the system through traditional UI components

Problem Statement

Users need to:

  • Quickly find information across multiple pages, files, and compliance reports
  • Get explanations of complex building code requirements
  • Navigate large architectural plan documents efficiently
  • Understand compliance issues without deep technical knowledge

Traditional UI navigation requires users to know exactly where to find information, resulting in:

  • Time wasted clicking through multiple screens
  • Difficulty discovering features
  • Steep learning curve for new users
  • Reduced productivity for power users

Goals

Primary Goals

  1. Natural Language Interface: Enable users to interact with PermitProof using conversational queries
  2. Intelligent Tool Usage: Agent should automatically select and use appropriate tools (API endpoints) to answer questions
  3. Context-Aware Responses: Maintain conversation history and project context
  4. Seamless Integration: Embed chat within existing Angular Material 3 UI without disrupting workflows

Secondary Goals

  1. Progressive Disclosure: Users discover PermitProof features through conversation
  2. Accessibility: Provide alternative interaction method for users with different needs
  3. Analytics: Track common user queries to improve product roadmap
  4. Mobile-Friendly: Support conversational interface on smaller screens

Non-Goals

  1. Voice Input: Not including speech-to-text in initial release
  2. Agent Training: Not allowing users to train custom agents
  3. Multi-Agent Orchestration: Single agent only (no agent-to-agent communication)
  4. Third-Party Integrations: Only PermitProof APIs in scope

Success Metrics

Quantitative Metrics

  • Adoption Rate: 40% of active users try chat within first month
  • Retention Rate: 60% of users who try chat use it again within 7 days
  • Task Completion: 70% of chat sessions result in successful task completion
  • Response Time: P95 response time < 3 seconds
  • API Success Rate: 95% of tool calls succeed
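
To make the response-time target checkable, P95 can be computed from logged latencies. The following is a minimal sketch using the nearest-rank method; the function name and the sample latencies are illustrative assumptions, not measured data.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) {
    throw new Error("percentile of an empty sample set is undefined");
  }
  if (p <= 0 || p > 100) {
    throw new Error("p must be in the range (0, 100]");
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

// Hypothetical latencies (ms) for ten chat responses.
const latenciesMs = [800, 950, 1100, 1200, 1300, 1500, 1700, 2100, 2600, 4200];
const p95 = percentile(latenciesMs, 95);
const meetsTarget = p95 < 3000; // target from this PRD: P95 < 3 seconds
```

With these made-up samples a single slow response is enough to miss the target, even though the median is well under 3 seconds. That is the reason to track a tail percentile rather than an average.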

Qualitative Metrics

  • User Satisfaction: Net Promoter Score (NPS) > 40
  • Ease of Use: System Usability Scale (SUS) score > 70
  • User Feedback: Positive sentiment in 80% of feedback

User Stories

US-1: Document Navigation and Discovery

As a building inspector
I want to ask "Where in the project files is the electrical plan for the kitchen in 2-bedroom units?"
So that I can quickly locate specific plan sheets without manually browsing all pages

Acceptance Criteria:

  • Chat searches across all plan pages and metadata
  • Returns specific page numbers and sheet references
  • Shows thumbnail previews of matching pages
  • Provides direct links to view full page

US-2: Code Compliance Analysis

As an architect
I want to ask "Does the stairwell width on the second floor plan meet IBC requirements?"
So that I can verify compliance without manually looking up code sections

Acceptance Criteria:

  • Agent identifies relevant plan pages (e.g., second floor architectural)
  • Calls GetApplicableCodeSections API for stairwell requirements
  • Extracts actual dimensions from plan if available
  • Compares against IBC Chapter 10 requirements (e.g., 44" minimum stairway width)
  • Provides clear pass/fail assessment with code citations

US-3: Compliance Issue Discovery

As a building inspector
I want to ask "Show me all pages with fire egress violations in this residential project"
So that I can focus my review on critical safety compliance issues

Acceptance Criteria:

  • Agent calls GetPageComplianceReport for all pages
  • Filters for fire/life safety violations (IBC Chapter 10)
  • Returns list of pages with specific violations
  • Shows severity level (major/minor) for each issue
  • Provides one-click navigation to flagged pages
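
The filtering step in these criteria could look roughly like the sketch below. The Finding shape is hypothetical (this PRD does not define the GetPageComplianceReport response), and the section pattern assumes citations formatted like "IBC 1020.2".

```typescript
// Hypothetical shape of one finding in a page compliance report.
interface Finding {
  pageNumber: number;
  codeSection: string; // e.g. "IBC 1020.2"
  severity: "major" | "minor";
  description: string;
}

// True for four-digit IBC sections starting with 10, i.e. Chapter 10
// (Means of Egress). Chapter 1 sections such as "IBC 101.2" do not match.
function isEgressSection(codeSection: string): boolean {
  return /^IBC\s+10\d{2}(\.\d+)*$/.test(codeSection.trim());
}

// Groups Chapter 10 findings by page so the UI can list flagged pages.
function egressViolationsByPage(findings: Finding[]): Map<number, Finding[]> {
  const byPage = new Map<number, Finding[]>();
  for (const finding of findings) {
    if (!isEgressSection(finding.codeSection)) continue;
    const list = byPage.get(finding.pageNumber) ?? [];
    list.push(finding);
    byPage.set(finding.pageNumber, list);
  }
  return byPage;
}

// Illustrative findings only.
const sampleFindings: Finding[] = [
  { pageNumber: 8, codeSection: "IBC 1020.2", severity: "major", description: "Corridor width" },
  { pageNumber: 8, codeSection: "IBC 903.2", severity: "minor", description: "Sprinkler note" },
  { pageNumber: 12, codeSection: "IBC 1013.1", severity: "minor", description: "Exit signage" },
];
const flagged = egressViolationsByPage(sampleFindings);
```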

US-4: Technical Specification Extraction

As a permit coordinator
I want to ask "What is the building height and how many stories are shown in these plans?"
So that I can verify zoning compliance without manually reading all sheets

Acceptance Criteria:

  • Agent searches title blocks, elevation sheets, and site plans
  • Extracts building height in feet/meters
  • Counts number of stories from floor plans
  • Cites specific pages where information was found
  • Compares against typical zoning requirements if applicable

US-5: Code Research and Explanation

As a junior architect
I want to ask "Explain the occupancy classification requirements for a mixed-use building"
So that I can understand which code sections apply to my project

Acceptance Criteria:

  • Agent searches ICC IBC 2021 for occupancy classifications (Chapter 3)
  • Explains mixed-use building rules (IBC 508)
  • Provides examples of common mixed-use scenarios
  • Cites specific code sections with brief summaries
  • Offers to analyze current project plans if applicable

US-6: Cross-Reference Analysis

As a plan reviewer
I want to ask "Does the plumbing shown on the mechanical plans match what's on the architectural floor plans for the second floor?"
So that I can identify coordination issues between disciplines

Acceptance Criteria:

  • Agent identifies relevant pages (architectural vs mechanical)
  • Extracts plumbing fixture locations from both plan sets
  • Compares locations and flags discrepancies
  • Highlights potential conflicts or missing elements
  • Provides page references for each finding

Functional Requirements

FR-1: Chat Interface Components

FR-1.1: Chat Window

  • Floating Action Button: Material 3 Extended FAB in the bottom-right corner
  • Expandable Panel: Slides up from bottom on mobile, side panel on desktop
  • Context Bar: Sticky header showing current context (project, file, page, pinned pages)
  • Context Chips: Visual chips showing active elements with dismiss/pin actions
  • Message List: Scrollable list of user and agent messages
  • Input Field: Material 3 text field with send button
  • Typing Indicator: Animated indicator when agent is processing

FR-1.2: Message Rendering

  • User Messages: Right-aligned, colored bubble (primary color)
  • Agent Messages: Left-aligned, colored bubble (surface-variant)
  • Markdown Support: Render formatted text, lists, code blocks
  • Link Rendering: Clickable links to navigate to specific pages/sections
  • Image Display: Show thumbnails of plan pages inline
  • Thinking Tokens Display: Show agent's reasoning process in muted, collapsible section
  • Tool Call Progress: Real-time display of API calls in progress with loading indicators
  • Tool Call Results: Collapsible sections showing:
    • Tool name and operation ID
    • Input parameters (formatted JSON)
    • Response data (formatted, truncated if large)
    • Execution time
    • Success/error status
  • Progressive Rendering: Stream content as it arrives (thinking → tool calls → response)
  • Expandable Details: All technical details (thinking, tool calls) collapsed by default to save space
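
Progressive rendering can be modelled as a pure reducer that folds each streamed chunk into the view state of one agent message. This is a sketch under assumptions: THINKING, TOOL_CALL_START and TEXT come from the example stream in this PRD, while TOOL_CALL_END, its ok flag, and the MessageView shape are hypothetical.

```typescript
type ToolCallView = { toolName: string; status: "running" | "ok" | "error" };

type Chunk =
  | { type: "THINKING"; thinkingContent: string }
  | { type: "TOOL_CALL_START"; toolCall: { toolName: string } }
  | { type: "TOOL_CALL_END"; toolCall: { toolName: string }; ok: boolean } // assumed
  | { type: "TEXT"; content: string; isFinal?: boolean };

interface MessageView {
  thinking: string;
  toolCalls: ToolCallView[];
  text: string;
  done: boolean;
}

const emptyView: MessageView = { thinking: "", toolCalls: [], text: "", done: false };

// Returns a new view for every chunk so the UI can re-render incrementally.
function applyChunk(view: MessageView, chunk: Chunk): MessageView {
  switch (chunk.type) {
    case "THINKING":
      return { ...view, thinking: view.thinking + chunk.thinkingContent };
    case "TOOL_CALL_START": {
      const started: ToolCallView = { toolName: chunk.toolCall.toolName, status: "running" };
      return { ...view, toolCalls: [...view.toolCalls, started] };
    }
    case "TOOL_CALL_END": {
      const endedName = chunk.toolCall.toolName;
      const endedStatus: ToolCallView["status"] = chunk.ok ? "ok" : "error";
      const toolCalls = view.toolCalls.map((call): ToolCallView =>
        call.toolName === endedName && call.status === "running"
          ? { toolName: call.toolName, status: endedStatus }
          : call
      );
      return { ...view, toolCalls };
    }
    case "TEXT":
      return { ...view, text: view.text + chunk.content, done: chunk.isFinal === true };
  }
}

// Replaying the example stream (plus an assumed TOOL_CALL_END).
const streamed: Chunk[] = [
  { type: "THINKING", thinkingContent: "I need to search..." },
  { type: "TOOL_CALL_START", toolCall: { toolName: "GetPlan" } },
  { type: "TOOL_CALL_END", toolCall: { toolName: "GetPlan" }, ok: true },
  { type: "TEXT", content: "Based on my search..." },
  { type: "TEXT", content: " here is the result", isFinal: true },
];
const finalView = streamed.reduce(applyChunk, emptyView);
```

Keeping the reducer pure means the same function can drive live streaming and the replay of stored history.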

FR-1.3: Chat Controls

  • Clear History: Button to clear conversation (with confirmation)
  • New Session: Start fresh conversation
  • Export Chat: Download conversation as text/JSON
  • Settings: Configure agent behavior (temperature, verbosity)

FR-2: Agent Backend Architecture

FR-2.1: Agent Service (Java)

  • LlmAgent: ADK-based agent using Gemini 2.5 Flash or Pro with thinking enabled
  • OpenApiToolset Integration: Automatically load all tools from openapi.yaml
  • Session Management: Maintain conversation history per user/project
  • Context Injection: Include current project/page context in agent prompts
  • Thinking Mode: Enable Gemini's thinking tokens for transparent reasoning
  • Event Streaming: Stream thinking, tool calls, and responses separately for progressive UI updates

FR-2.2: Tool Execution

  • Automatic Tool Selection: Agent determines which APIs to call
  • Authentication Forwarding: Pass user's Firebase auth token to API calls
  • Error Handling: Handle API errors gracefully, retrying transient failures
  • Rate Limiting: Prevent abuse with per-user rate limits
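
One common way to implement per-user rate limits is a token bucket. The sketch below is written in TypeScript for consistency with the other examples in this document, although the backend described here is Java; the class name, capacity and refill rate are assumptions.

```typescript
// Token bucket per user: up to `capacity` requests may burst, after which
// requests are admitted at `refillPerSecond`.
class PerUserRateLimiter {
  private readonly buckets = new Map<string, { tokens: number; lastRefillMs: number }>();
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Returns true if the request is admitted. The current time is passed in
  // so the limiter stays deterministic and easy to test.
  tryAcquire(userId: string, nowMs: number): boolean {
    const bucket = this.buckets.get(userId) ?? { tokens: this.capacity, lastRefillMs: nowMs };
    const elapsedSeconds = Math.max(0, nowMs - bucket.lastRefillMs) / 1000;
    const tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    const admitted = tokens >= 1;
    this.buckets.set(userId, { tokens: admitted ? tokens - 1 : tokens, lastRefillMs: nowMs });
    return admitted;
  }
}

// Hypothetical limits: burst of 2 messages, then 1 message per second.
const limiter = new PerUserRateLimiter(2, 1);
```

Because Cloud Run may run several instances, an in-memory limiter like this only bounds traffic per instance; a shared store would be needed for a global limit.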

FR-2.3: Streaming Responses

Primary Transport: Server-Sent Events (SSE)

SSE is a web standard for server-to-client streaming over HTTP:

  • How It Works: Server pushes updates to client as text/event-stream
  • Browser API: JavaScript EventSource for consuming streams
  • One-Way: Server → Client (perfect for agent responses)
  • Auto-Reconnect: Browser handles connection drops
  • Firewall-Friendly: Uses standard HTTP, works through proxies

Implementation:

  • gRPC Side: ChatService uses server-side streaming (stream ChatMessageChunk)
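  • REST Side: ESPv2 transcodes the gRPC server stream to a streamed HTTP response, delivered to the browser as SSE (the text/event-stream framing should be verified, since HTTP/JSON transcoding emits a streamed JSON array by default)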
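  • Frontend: Angular receives events with EventSource, or with a fetch-based stream reader where a POST body or an Authorization header is required (EventSource supports neither)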
  • Progress Updates: Stream thinking, tool calls, and responses as separate events

Example SSE Stream:

event: message
data: {"type":"THINKING","thinkingContent":"I need to search..."}

event: message
data: {"type":"TOOL_CALL_START","toolCall":{"toolName":"GetPlan"}}

event: message
data: {"type":"TEXT","content":"Based on my search..."}

event: message
data: {"type":"TEXT","content":" here is the result","isFinal":true}
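
EventSource parses this format natively, but it only issues GET requests. If the stream is read another way (for example from a fetch response body for a POST endpoint), the client has to split the text into events itself. The sketch below shows that parsing step only; the function and type names are assumptions.

```typescript
interface SseEvent {
  event: string; // defaults to "message" when no "event:" line is present
  data: string;  // "data:" lines of one event, joined with "\n"
}

// Splits buffered stream text into complete events. Events are separated by
// a blank line; text after the last blank line is returned as `rest` so the
// caller can prepend it to the next network chunk.
function parseSseBuffer(buffer: string): { events: SseEvent[]; rest: string } {
  const normalized = buffer.replace(/\r\n/g, "\n");
  const blocks = normalized.split("\n\n");
  const rest = blocks.pop() ?? "";
  const events: SseEvent[] = [];
  for (const block of blocks) {
    let event = "message";
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith(":")) continue; // comment or keep-alive line
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }
    if (dataLines.length > 0) events.push({ event, data: dataLines.join("\n") });
  }
  return { events, rest };
}

// A network chunk that ends in the middle of the second event.
const firstChunk =
  'event: message\ndata: {"type":"THINKING","thinkingContent":"I need to search..."}\n\n' +
  'event: message\ndata: {"type":"TEXT","con';
const firstPass = parseSseBuffer(firstChunk);
const secondPass = parseSseBuffer(firstPass.rest + 'tent":"Done","isFinal":true}\n\n');
```

Each data payload is then passed to JSON.parse to obtain the chunk object.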

Why SSE Instead of WebSocket?

  • Simpler implementation (plain HTTP request and response, no protocol upgrade)
  • Agent responses are one-way (server → client)
  • User messages travel as ordinary HTTP POST requests
  • Works better with ESPv2 transcoding
  • Lower overhead for our use case

FR-3: API Endpoints

FR-3.1: Chat API (gRPC + REST via ESPv2)

gRPC Service: ChatService (defined in chat.proto)
REST Endpoints: Auto-generated by gRPC-Gateway annotations

POST /v1/chat/sessions/{sessionId}/stream
- Request: { "message": "user query", "context": {...} }
- Response: SSE stream (text/event-stream)
- event: message, data: {"type":"THINKING",...}
- event: message, data: {"type":"TOOL_CALL_START",...}
- event: message, data: {"type":"TEXT","content":"...",...}
- gRPC: rpc StreamChat() returns (stream ChatMessageChunk)

POST /v1/chat/sessions
- Request: { "projectId": "...", "metadata": {...} }
- Response: { "sessionId": "...", "createdAt": "..." }
- gRPC: rpc CreateSession(CreateSessionRequest) returns (SessionResponse)

GET /v1/chat/sessions/{sessionId}
- Response: { "messages": [...], "hasMore": false }
- gRPC: rpc GetSessionHistory(GetSessionHistoryRequest) returns (SessionHistoryResponse)

DELETE /v1/chat/sessions/{sessionId}
- Response: Empty (204 No Content)
- gRPC: rpc DeleteSession(DeleteSessionRequest) returns (google.protobuf.Empty)

Note: ESPv2 automatically transcodes:

  • gRPC server-side streaming → streamed HTTP response, consumed by the client as SSE (Server-Sent Events)
  • Unary RPCs → Regular REST requests

FR-3.2: Agent Management API

GET /v1/chat/agents
- Lists available agents (future: multiple specialized agents)

GET /v1/chat/agents/{agentId}/capabilities
- Returns list of tools/capabilities for agent

FR-4: Integration Points

FR-4.1: Context Awareness

  • Current Project: Auto-include active project ID in agent context
  • Current File: Include active file ID and name when viewing documents
  • Current Page: Include page number when viewing plan pages
  • Pinned Pages: Allow users to pin important pages for ongoing reference
  • Context Visibility: Display all active context elements in chat header
  • Context Management: Users can add/remove context items via chip actions
  • User Permissions: Agent respects RBAC permissions
  • Recent Activity: Include user's recent actions for better context

FR-4.2: Deep Linking

  • Navigate to Pages: [View Page 5](/projects/abc/pages/5)
  • Open Compliance Reports: Link to specific compliance findings
  • Search Results: Clickable search result links
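
A small helper can keep deep links consistent wherever the agent output is rendered. The route shape follows the example above (/projects/{projectId}/pages/{pageNumber}); the helper names are hypothetical.

```typescript
// Builds the in-app route for a plan page.
function pagePath(projectId: string, pageNumber: number): string {
  if (!Number.isInteger(pageNumber) || pageNumber < 1) {
    throw new Error("pageNumber must be a positive integer");
  }
  return `/projects/${encodeURIComponent(projectId)}/pages/${pageNumber}`;
}

// Renders the Markdown link shown in chat messages.
function pageLinkMarkdown(projectId: string, pageNumber: number): string {
  return `[View Page ${pageNumber}](${pagePath(projectId, pageNumber)})`;
}

const exampleLink = pageLinkMarkdown("abc", 5);
```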

FR-4.3: Notifications

  • Long-Running Tasks: Toast notification when background task completes
  • Errors: User-friendly error messages in chat
  • Suggestions: Proactive suggestions based on user's workflow

Non-Functional Requirements

NFR-1: Performance

  • First Response: < 2 seconds for simple queries (no tool calls)
  • Complex Queries: < 5 seconds for queries requiring multiple API calls
  • Streaming Latency: < 100ms token-to-token latency
  • Concurrent Users: Support 100 concurrent chat sessions

NFR-2: Reliability

  • Uptime: 99.5% availability during business hours
  • Error Rate: < 5% of queries result in errors
  • Graceful Degradation: If agent fails, provide helpful error message

NFR-3: Security

  • Authentication: All chat requests require Firebase authentication
  • Authorization: Agent respects user's RBAC permissions
  • Data Privacy: Chat logs comply with data retention policies
  • Input Sanitization: Prevent prompt injection attacks
  • API Key Security: Gemini API keys secured in Secret Manager

NFR-4: Scalability

  • Horizontal Scaling: Backend can scale to multiple instances
  • Session State: Store session state in Redis/Firestore
  • Stateless Agent: Agent instances are stateless for easy scaling

NFR-5: Usability

  • Mobile Responsive: Chat works on mobile screens (320px+)
  • Keyboard Navigation: Full keyboard support (Enter to send, etc.)
  • Screen Reader: ARIA labels for accessibility
  • Dark Mode: Respects user's theme preference

NFR-6: Observability

  • Query Logging: Log all user queries (sanitized)
  • Tool Usage Metrics: Track which tools are most used
  • Error Tracking: Detailed error logs in Cloud Logging
  • Performance Tracing: OpenTelemetry tracing for latency analysis

Technical Architecture

Deployment Architecture

✅ Pure gRPC Architecture (No Spring Boot, No Separate Backend)

This design integrates chat directly into your existing gRPC service on Cloud Run.

Key Points:

  1. Chat functionality is integrated into existing gRPC Cloud Run service (not a separate backend)
  2. New ChatService gRPC service added alongside existing services (ArchitecturalPlanService, etc.)
  3. Agent runs in-process within the same Cloud Run instance
  4. Agent uses OpenApiToolset to call other services via loopback through ESPv2
  5. ESPv2 proxy handles REST ↔ gRPC transcoding for all APIs (existing + chat)
  6. Zero new infrastructure - reuses your existing gRPC + ESPv2 + Cloud Run stack

What This Means:

  • No separate deployment for chat
  • No Spring Boot framework needed
  • No additional Cloud Run service costs
  • Agent and services run in the same JVM process
  • Same deployment process: ./cli/sdlc/full-stack-deploy.sh

High-Level Architecture

┌──────────────────────────────────────────────────────────────┐
│                       Angular Frontend                       │
│  ┌────────────────┐  ┌──────────────┐  ┌─────────────────┐   │
│  │ Chat Component │  │ Chat Service │  │ SSE Client      │   │
│  │ (Material 3)   │←→│ (TypeScript) │←→│ (EventSource)   │   │
│  └────────────────┘  └──────────────┘  └─────────────────┘   │
└──────────────────────────────────────────────┬───────────────┘
                                               │ HTTPS/SSE
                                               ▼
          ┌──────────────────────────────────────────┐
          │  ESPv2 Proxy (Cloud Run)                 │
          │  - REST ↔ gRPC transcoding               │
          │  - Firebase auth validation              │
          │  - Serves /v1/chat/* endpoints           │
          └──────────────┬───────────────────────────┘
                         │ gRPC (Internal)
                         ▼
┌──────────────────────────────────────────────────────────────┐
│               gRPC Services (Cloud Run - Java)               │
│                                                              │
│   ┌──────────────────────────────────────────────────────┐   │
│   │ NEW: ChatService (gRPC)                              │   │
│   │ - StreamChat(request) → stream ChatMessageChunk      │   │
│   │ - CreateSession()                                    │   │
│   │ - GetSessionHistory()                                │   │
│   └────────────────┬─────────────────────────────────────┘   │
│                    │                                         │
│                    ▼                                         │
│   ┌──────────────────────────────────────────────────────┐   │
│   │ ChatAgentService (Java)                              │   │
│   │ - InMemoryRunner with LlmAgent                       │   │
│   │ - Event streaming                                    │   │
│   │ - Context injection                                  │   │
│   └────────────────┬─────────────────────────────────────┘   │
│                    │                                         │
│                    ▼                                         │
│   ┌──────────────────────────────────────────────────────┐   │
│   │ LlmAgent (ADK)                                       │   │
│   │ - Model: Gemini 2.5 Flash w/ thinking                │   │
│   │ - Tools: OpenApiToolset                              │   │
│   │ - System prompt with context                         │   │
│   └────────────────┬─────────────────────────────────────┘   │
│                    │                                         │
│                    ▼                                         │
│   ┌──────────────────────────────────────────────────────┐   │
│   │ OpenApiToolset                                       │   │
│   │ - Loads openapi.yaml                                 │   │
│   │ - Creates REST tools                                 │   │
│   │ - Calls ESPv2 endpoints                              │   │
│   └────────────────┬─────────────────────────────────────┘   │
│                    │                                         │
│   ┌────────────────▼─────────────────────────────────────┐   │
│   │ Existing gRPC Services:                              │   │
│   │ - ArchitecturalPlanService                           │   │
│   │ - ArchitecturalPlanReviewService                     │   │
│   │ - ComplianceCodeSearchService                        │   │
│   └──────────────────────────────────────────────────────┘   │
│                                                              │
│   ┌──────────────────────────────────────────────────────┐   │
│   │ Firestore (via Admin SDK)                            │   │
│   │ - Chat sessions                                      │   │
│   │ - Message history                                    │   │
│   └──────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────┘

Agent calls its own APIs via loopback through ESPv2:
LlmAgent → OpenApiToolset → http://localhost:8080 or ESPv2 URL
→ gRPC Services (same process)

Component Responsibilities

Frontend Components

ChatComponent (web-ng-m3/src/app/components/chat/)

  • Material 3 UI components (FAB, panel, messages)
  • Message rendering (Markdown, links, images)
  • User input handling
  • SSE connection management

ChatService (web-ng-m3/src/app/services/chat.service.ts)

  • API client for chat endpoints
  • Session management
  • Message queue/buffering
  • State management (RxJS)

ChatMessageComponent (web-ng-m3/src/app/components/chat/message/)

  • Individual message bubble
  • Markdown rendering
  • Link click handling
  • Copy/share actions

Backend Components (gRPC Services)

ChatService (src/main/proto/chat.proto + ChatServiceImpl.java)

  • gRPC service definition with server-side streaming
  • REST endpoints via gRPC-Gateway annotations
  • Session creation/management RPCs
  • Message streaming RPC

ChatAgentService (src/main/java/.../service/ChatAgentService.java)

  • ADK agent initialization (singleton per Cloud Run instance)
  • OpenApiToolset loading from openapi.yaml
  • Context injection (project, file, page, pinned pages)
  • Tool execution coordination
  • Event streaming (thinking, tool calls, responses)

ChatSessionService (src/main/java/.../service/ChatSessionService.java)

  • Session persistence (Firestore via Admin SDK)
  • Conversation history management
  • Session expiration/cleanup
  • User authorization validation

OpenApiToolset (src/main/java/com/google/adk/tools/openapi/)

  • Already exists in your codebase!
  • Wraps each OpenAPI operation as an ADK tool
  • Makes HTTP calls to gRPC-Gateway or ESPv2
  • Handles request/response transformation

Data Models - Three Layers Explained

Important: There are THREE separate data model layers in this design:

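| Layer           | Models                        | Purpose                  | Location                          |
|-----------------|-------------------------------|--------------------------|-----------------------------------|
| 1. ADK Internal | Session, Event                | Agent conversation state | com.google.adk.* (already exists) |
| 2. gRPC Proto   | ChatMessageChunk, ChatContext | API wire protocol        | chat.proto (new)                  |
| 3. TypeScript   | ChatSession, ChatMessage      | Frontend types           | chat.models.ts (new)              |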

Layer 1 (ADK) is reused - We don't redefine Session/Event
Layers 2 & 3 are new - They define our chat-specific API and UI models

Relationship Between Layers

Angular (TypeScript models)
↓ HTTP/SSE
gRPC Proto (ChatMessageChunk)
↓ ChatServiceImpl converts
ADK Internal (Session, Event) ← Managed by InMemoryRunner

Why Three Layers?

  • ADK models = Internal agent state (we don't control these)
  • Proto models = gRPC API contract (we define these)
  • TypeScript models = Frontend types (mirror proto models)

Summary: What We Reuse vs What We Create

✅ REUSE (ADK Already Provides):

  • com.google.adk.sessions.Session - Internal session management
  • com.google.adk.events.Event - Internal event structure
  • com.google.adk.runner.InMemoryRunner - Orchestrates agent execution
  • Session/Event management is handled automatically by ADK

🆕 CREATE (New Code):

  • chat.proto - gRPC service definition with our chat-specific messages
  • ChatServiceImpl.java - gRPC service implementation
  • ChatAgentService.java - Wraps InMemoryRunner, converts ADK Events to our models
  • ChatSession (Firestore) - UI metadata (projectId, pinnedPages, etc.)
  • ChatMessage (Firestore) - Simplified message history for display
  • TypeScript models - Frontend types (mirror the proto messages)

Important: The TypeScript models are NOT duplicating ADK classes - they're the frontend API contract!

Chat Session (Firestore Metadata)

This stores additional metadata beyond ADK Session:

/**
 * Chat session metadata (NOT the same as ADK Session)
 *
 * ADK Session is managed internally by InMemoryRunner
 * ChatSession adds PermitProof-specific fields
 */
interface ChatSession {
  sessionId: string;         // Maps to ADK Session.id
  userId: string;            // Maps to ADK Session.userId
  projectId?: string;        // NEW: PermitProof context
  createdAt: Timestamp;      // NEW: UI tracking
  lastMessageAt: Timestamp;  // NEW: UI tracking
  messageCount: number;      // NEW: UI counter
  metadata: {
    agentName: string;       // Maps to ADK Session.appName
    modelName: string;       // NEW: Gemini model version
    userAgent: string;
  };
}

Chat Message

/**
 * Simplified chat message (NOT the same as ADK Event)
 *
 * ADK Event has complex structure with Parts, FunctionCalls, etc.
 * ChatMessage extracts the essentials for UI display
 */
interface ChatMessage {
  messageId: string;                  // Maps to ADK Event.id
  sessionId: string;
  role: 'user' | 'agent' | 'system';  // Derived from Event.author
  content: string;                    // Extracted from Event.content() Parts
  timestamp: Timestamp;               // Derived from Event.timestamp()
  metadata?: {
    thinkingContent?: string;         // NEW: From thinking tokens
    toolCalls?: ToolCall[];           // NEW: From FunctionCalls
    tokens?: TokenUsage;              // NEW: From Gemini response
    latencyMs?: number;               // NEW: Measured by us
  };
}

Tool Call

interface ToolCall {
  toolName: string;
  operationId: string;
  parameters: Record<string, any>;
  result?: any;
  error?: string;
  durationMs: number;
}
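
As an illustration of how a ToolCall might feed the collapsed "Tool Call Results" section, the sketch below derives a status line and a truncated detail string. The interface is repeated in a stricter form (unknown instead of any) so the example is self-contained; the helper names and the 200-character default are assumptions.

```typescript
interface ToolCall {
  toolName: string;
  operationId: string;
  parameters: Record<string, unknown>;
  result?: unknown;
  error?: string;
  durationMs: number;
}

// Shortens long text for the collapsed view and marks the cut with an ellipsis.
function truncate(text: string, maxChars: number): string {
  return text.length <= maxChars ? text : text.slice(0, maxChars) + "…";
}

// Produces the one-line status plus the (possibly truncated) detail body.
function toolCallSummary(call: ToolCall, maxChars = 200): { status: string; detail: string } {
  if (call.error !== undefined) {
    return {
      status: `✗ ${call.toolName} failed (${call.durationMs}ms)`,
      detail: truncate(call.error, maxChars),
    };
  }
  const json = JSON.stringify(call.result ?? null, null, 2);
  return { status: `✓ ${call.toolName} (${call.durationMs}ms)`, detail: truncate(json, maxChars) };
}

// Illustrative values only.
const sampleCall: ToolCall = {
  toolName: "GetPageComplianceReport",
  operationId: "ArchitecturalPlanReviewService_GetPageComplianceReport",
  parameters: { pageNumber: 8 },
  result: { violations: [] },
  durationMs: 1240,
};
const sampleSummary = toolCallSummary(sampleCall);
```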

API Tools Available to Agent

The agent will have access to all operations defined in openapi.yaml (auto-generated from proto files with gRPC-Gateway).

Important: Tools are loaded at agent initialization from the OpenAPI spec, meaning:

  • New proto service methods → regenerate openapi.yaml → restart Cloud Run
  • Agent automatically gets new tools without code changes to ChatAgentService

Architectural Plan APIs

  • ArchitecturalPlanService_ListArchitecturalPlanIds - List all plans
  • ArchitecturalPlanService_GetArchitecturalPlan - Get plan details
  • ArchitecturalPlanService_GetArchitecturalPlanPagePdf - Get page PDF
  • ArchitecturalPlanService_GetArchitecturalPlanPageMarkdown - Get page as markdown
  • ArchitecturalPlanService_GetArchitecturalPlanPageTranscript - Get OCR transcript
  • ArchitecturalPlanService_GetArchitecturalPlanPageExplanation - Get AI explanation

Plan Review APIs

  • ArchitecturalPlanReviewService_GetApplicableCodeSections - Get applicable ICC codes
  • ArchitecturalPlanReviewService_GetPageComplianceReport - Get compliance report
  • ArchitecturalPlanReviewService_GetPageInspectionChecklist - Get inspection checklist

Code Search APIs

  • ComplianceCodeSearchService_GetIccCodeSearchResults - Search ICC codes
  • ComplianceCodeSearchService_GetIccBookChapter - Get code chapter

Agent System Prompt

You are PermitProof Assistant, an AI expert in building code compliance and 
architectural plan review. You help users navigate construction permits,
analyze architectural plans, and understand building code requirements.

You have access to the following capabilities:
1. Retrieve and analyze architectural plans (PDFs, transcripts, explanations)
2. Identify applicable building code sections
3. Generate compliance reports
4. Search ICC building codes
5. Answer questions about building regulations

When a user asks a question:
1. Determine which tools are needed
2. Call the appropriate APIs with correct parameters
3. Synthesize results into a clear, professional response
4. Cite specific page numbers, code sections, and sources
5. Provide actionable recommendations when applicable

Context:
- Current Project: {{PROJECT_ID}}
- Current Page: {{PAGE_NUMBER}}
- User Role: {{USER_ROLE}}

Be concise, accurate, and helpful. If you're unsure, say so and suggest
alternative approaches.
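
Context injection into this prompt amounts to placeholder substitution. A minimal sketch follows; the function name and the sample values are assumptions, and the placeholder syntax is the {{NAME}} form used above.

```typescript
// Replaces {{NAME}} placeholders with values. Placeholders without a value
// are left intact so a missing context item is visible, not silently dropped.
function renderPrompt(template: string, values: Record<string, string | undefined>): string {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (match: string, name: string) => {
    const value = values[name];
    return value === undefined ? match : value;
  });
}

const contextTemplate =
  "- Current Project: {{PROJECT_ID}}\n" +
  "- Current Page: {{PAGE_NUMBER}}\n" +
  "- User Role: {{USER_ROLE}}";

// Hypothetical context for a user viewing page 8 of a project.
const renderedContext = renderPrompt(contextTemplate, {
  PROJECT_ID: "proj_oak_001",
  PAGE_NUMBER: "8",
});
```

Values that originate from users (for example project names) end up inside the prompt, so they deserve the same scrutiny as the chat message itself.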

User Interface Design

Desktop Layout

┌───────────────────────────────────────────────────────────┐
│ PermitProof - Project ABC                                 │
├─────────────────────────────────────┬─────────────────────┤
│                                     │                     │
│                                     │ ╭─────────────────╮ │
│  Main Content Area                  │ │ 🤖 Chat         │ │
│  (Plan Viewer / Dashboard)          │ ├─────────────────┤ │
│                                     │ │ 📌 Context:     │ │
│                                     │ │ [ABC ×] [P.5×]  │ │
│                                     │ │ [E-2.1 ×] +     │ │
│                                     │ ├─────────────────┤ │
│                                     │ │ 👤 User:        │ │
│                                     │ │ Where is the    │ │
│                                     │ │ electrical...   │ │
│                                     │ ├─────────────────┤ │
│                                     │ │ 🤖 Agent:       │ │
│                                     │ │ ▼ 💭 Thinking   │ │
│                                     │ │ ▼ 🔧 Tool (2)   │ │
│                                     │ │ Based on my     │ │
│                                     │ │ search...       │ │
│                                     │ │ [Page 8] 📄     │ │
│                                     │ ├─────────────────┤ │
│                                     │ │ 👤 User:        │ │
│                                     │ │ Does it meet..  │ │
│                                     │ ├─────────────────┤ │
│                                     │ │ 🤖 ⚙️ Working...│ │
│                                     │ ╰─────────────────╯ │
│                                     │ ┌─────────────────┐ │
│                                     │ │ Ask question..  │ │
│                                     │ └─────────────────┘ │
└─────────────────────────────────────┴─────────────────────┘

Legend:
▼ = Collapsible section (click to expand/collapse)
[Item ×] = Context chip with remove action
+ = Add more context

Mobile Layout

┌───────────────────┐
│ PermitProof       │
├───────────────────┤
│                   │
│   Main Content    │
│                   │
│                   │
│                   │
│                   │
│         [+] ← FAB │
└───────────────────┘

(Tap FAB to open chat panel from bottom)

┌───────────────────┐
│ PermitProof       │
├───────────────────┤
│ User: Show page 5 │
├───────────────────┤
│ Agent: Here is... │
│ [Page 5 Link]     │
├───────────────────┤
│ 💭 Thinking...    │
├───────────────────┤
│ Ask a question... │
│               [>] │
└───────────────────┘

Example Conversations

Example 1: Simple Information Retrieval with Context

📌 Context: [Project: Oakmont ×] [File: Plans.pdf ×] [+]

User: Where is the electrical plan for the kitchen in unit 2B?

Agent: [💭 Thinking... (click to expand)]
└─ The user is asking about unit 2B's kitchen electrical. I can see
we're already viewing the Oakmont project plans, so I'll search
within this project's pages for electrical layouts.

[🔧 Calling API: ArchitecturalPlanService_ListArchitecturalPlanIds]
⚙️ Searching Oakmont project pages...
✓ Found 24 pages (120ms)

[🔧 Calling API: ArchitecturalPlanService_GetArchitecturalPlanPageTranscript]
⚙️ Analyzing page 8: "Second Floor - Electrical Plan"
✓ Completed (450ms)

Based on my search of the **Oakmont project** plans, the electrical plan for
unit 2B's kitchen is on:

📄 **Page 8** - "Second Floor Electrical Plan"
• Sheet E-2.1
• Shows kitchen outlets, lighting, and appliance circuits
• [View Page 8] [📌 Pin Page]

The kitchen has:
- (4) Counter outlets on dedicated 20A circuits
- Recessed lighting on dimmer switch
- Range outlet (240V/40A)
- Dishwasher and disposal on separate circuits

Would you like me to check if this meets code requirements?

Example 2: Multi-Step Query with Progressive Updates and Pinned Pages

📌 Context: [Project: Oakmont ×] [📌 Page 5: Floor Plan ×] [📌 Page 12: Elevations ×] [+]

User: Show me all fire egress violations in this project

Agent: [💭 Thinking... (click to expand)]
└─ I need to analyze all pages of the Oakmont project for fire egress
compliance issues. I can see the user has pinned pages 5 and 12, so
I'll pay special attention to those. I'll check each page's compliance
report and filter for IBC Chapter 10 (Means of Egress) violations.

[🔧 Processing 24 pages...]
✓ Page 1-6: No fire egress issues
⚙️ Page 7: Analyzing...
⚠️ Page 8: Found corridor width issue
✓ Page 9-11: No fire egress issues
⚠️ Page 12: Found exit signage issue
✓ Page 13-24: No fire egress issues

[📊 Results: 2 violations found]

┌─ [🔧 Page 8 Tool Call Details] (click to expand)
│ Tool: ArchitecturalPlanReviewService_GetPageComplianceReport
│ Parameters: { projectId: "proj_oak_001", pageNumber: 8 }
│ Duration: 1,240ms
│ Status: ✓ Success
└─

Found **2 fire egress violations** requiring attention:

📄 **Page 8** - Second Floor Plan
⚠️ **Major Violation**: Exit corridor width 42" (requires 44" per IBC 1020.2)
• Location: North corridor between Units 2A and 2B
• Required: 44 inches minimum
• Provided: 42 inches
• [View Page 8] [View Code Section]

📄 **Page 12** - Elevations
⚠️ **Minor Violation**: Missing exit signage locations
• Code: IBC 1013.1 (Exit Signs)
• Required: Every exit and exit access door
• Issue: No signage shown at north stairwell door
• [View Page 12] [View Code Section]

**Recommended Actions:**
1. Widen north corridor to 44" or request variance
2. Add exit signage symbols to plan at stairwell doors

💡 **Note**: I also reviewed your pinned pages. Page 5 (Floor Plan) shows no egress
issues; Page 12 (Elevations) contains the exit signage violation listed above.

Would you like me to generate a formal compliance report PDF?

[📌 Pin Page 8] - Add to context for follow-up questions

Example 3: Code Research with Thinking

User: What does IBC Section 1020.2 say about corridor widths?

Agent: [💭 Thinking... (click to expand)]
└─ The user is asking about a specific IBC code section (1020.2)
regarding corridor widths. I should search the ICC building code
database and provide the exact text along with context about
exceptions and measurement requirements.

[🔧 Calling API: ComplianceCodeSearchService_GetIccCodeSearchResults]
⚙️ Searching IBC 2021 Book (ID: 2217)
⚙️ Query: "Section 1020.2 corridor width"
✓ Found section (680ms)

**IBC Section 1020.2** - Exit Access Corridor Width

**Minimum Width**: Exit access corridors shall be not less than **44 inches
(1118 mm)** in width.

**Exceptions**:
1. Corridors serving an occupant load of less than 50 shall be not less
than **36 inches (914 mm)** in width
2. Corridors in Group I-2 occupancies shall comply with Section 407.3
3. Corridors in Group I-3 occupancies shall comply with Section 408.7

**Measurement**: Width shall be measured in the **clear** at the narrowest point,
meaning no obstructions (doors when open, handrails, etc.) can reduce the width
below the minimum.

**Related Sections**:
- IBC 1005: Egress width calculation
- IBC 1018.2: Corridor continuity
- IBC 1020.1: Corridor construction requirements

📖 **Source**: IBC 2021, Chapter 10 (Means of Egress)
[View Full Section in Code Book]

💡 **Common Issues**:
- Door swing reducing effective width
- Built-in cabinets or fixtures encroaching
- Recessed door frames not counted in clear width

Is there a specific corridor in your project you'd like me to analyze for
compliance with this section?

Implementation Phases

Phase 1: MVP (4 weeks)

Goal: Basic chat with read-only operations

Deliverables:

  • Angular chat component (Material 3)
  • ChatService gRPC implementation (chat.proto + ChatServiceImpl.java)
  • ADK agent with OpenApiToolset
  • Support for 5 core tools:
    • List plans
    • Get plan details
    • Get page PDF
    • Get applicable codes
    • Search ICC codes
  • SSE streaming responses
  • Basic session management (in-memory)

Success Criteria:

  • Users can ask questions and get responses
  • Agent successfully calls APIs
  • Response time < 3 seconds
  • Works on desktop and mobile

Phase 2: Enhanced UX (3 weeks)

Goal: Production-ready interface with persistence

Deliverables:

  • Firestore session persistence
  • All OpenAPI tools enabled (20+ operations)
  • Markdown rendering with syntax highlighting
  • Deep linking to pages/sections
  • Export chat history
  • Dark mode support
  • Typing indicators and progress states
  • Error retry logic

Success Criteria:

  • Session persistence across browser refreshes
  • All API tools working
  • Rich message formatting
  • Comprehensive error handling

Phase 3: Intelligence (3 weeks)

Goal: Smart context and proactive features

Deliverables:

  • Project context injection
  • Page context injection
  • Recent activity awareness
  • Suggested queries based on user's workflow
  • Tool usage analytics
  • Query rewriting for better results
  • Multi-turn conversation memory

Success Criteria:

  • Agent uses context effectively
  • Suggestions are relevant at least 70% of the time
  • Improved task completion rate

Phase 4: Polish (2 weeks)

Goal: Production hardening

Deliverables:

  • Comprehensive error handling
  • Rate limiting
  • Performance optimization
  • Security audit
  • Accessibility audit (WCAG 2.1 AA)
  • User documentation
  • Admin dashboard for monitoring

Success Criteria:

  • Passes security review
  • WCAG 2.1 AA compliant
  • Performance meets NFRs
  • Ready for production deployment

Risks and Mitigations

Risk 1: LLM Hallucinations

Impact: High - Users receive incorrect information
Likelihood: Medium
Mitigation:

  • Always cite API sources in responses
  • Include disclaimer about AI-generated content
  • Implement confidence scoring
  • Allow users to report incorrect responses

Risk 2: API Rate Limits (Gemini)

Impact: High - Service degradation
Likelihood: Medium
Mitigation:

  • Implement exponential backoff
  • Queue requests during high load
  • Cache common queries
  • Monitor quota usage with alerts
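
The exponential backoff mitigation can be sketched as below. This is a minimal illustration, assuming a hypothetical `RateLimitError` that stands in for whatever quota or HTTP 429 error the Gemini client raises; the base delay, cap, and retry count are placeholder values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a quota or HTTP 429 error from the model API."""


def backoff_delays(max_retries: int, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Upper bound for each retry wait: base * 2**attempt, clipped at cap."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    base: float = 0.5,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with jittered exponential backoff."""
    for bound in backoff_delays(max_retries, base, cap):
        try:
            return fn()
        except RateLimitError:
            # Full jitter: wait a random time up to the current bound so that
            # many clients do not retry in lockstep.
            sleep(random.uniform(0, bound))
    return fn()  # final attempt; any error propagates to the caller
```

The `sleep` parameter is injectable so the retry loop can be unit tested without real delays.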

Risk 3: Cost (Gemini API)

Impact: Medium - Unexpected bills
Likelihood: High
Mitigation:

  • Set monthly budget alerts
  • Implement per-user rate limits
  • Use Flash model for simple queries
  • Cache responses for common questions
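
Per-user rate limits could take the form of a token bucket, sketched below. The class name, capacity, and refill rate are assumptions for illustration; the PRD does not specify a limiting algorithm.

```python
import time
from typing import Callable


class PerUserRateLimiter:
    """Token bucket per user: each request spends one token, tokens refill over time."""

    def __init__(self, capacity: int, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        # user_id -> (tokens remaining, time of last update)
        self._buckets: dict[str, tuple[float, float]] = {}

    def allow(self, user_id: str) -> bool:
        """Return True and spend a token if the user is under their limit."""
        now = self.clock()
        tokens, last = self._buckets.get(user_id, (float(self.capacity), now))
        # Refill for the elapsed time, never exceeding the bucket capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        if tokens >= 1.0:
            self._buckets[user_id] = (tokens - 1.0, now)
            return True
        self._buckets[user_id] = (tokens, now)
        return False
```

A bucket allows short bursts (up to `capacity` messages) while bounding the sustained request rate, which maps directly to a bound on Gemini spend per user.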

Risk 4: Privacy/Compliance

Impact: High - Data leak or compliance violation
Likelihood: Low
Mitigation:

  • Sanitize logs (remove PII)
  • Implement data retention policies
  • Get legal review of how chat logs are stored and retained
  • Allow users to delete chat history
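
Log sanitization might start with pattern-based redaction, as in this sketch. The two regular expressions are deliberately simple assumptions; they cover common email and phone formats only and would not replace a dedicated PII detection service.

```python
import re

# Simplified patterns for illustration; real PII detection needs broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def sanitize(message: str) -> str:
    """Replace email addresses and phone numbers with fixed placeholders."""
    message = EMAIL.sub("[EMAIL]", message)
    return PHONE.sub("[PHONE]", message)
```

Short numeric tokens such as code section numbers and project IDs are left intact, so sanitized logs remain useful for debugging tool calls.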

Risk 5: Poor Tool Selection

Impact: Medium - Incorrect API calls
Likelihood: Medium
Mitigation:

  • Comprehensive tool descriptions in OpenAPI
  • Include examples in tool descriptions
  • Implement tool call validation
  • Monitor and tune agent prompts
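
Tool call validation could check the agent's chosen operation and arguments before any request is sent. The sketch below assumes a hand-written parameter registry (`TOOL_PARAMS`); in practice the expected parameters would more likely be derived from the OpenAPI spec.

```python
# Illustrative registry: operation ID -> {parameter name: expected Python type}.
# Only this one operation is taken from the PRD's OpenAPI example.
TOOL_PARAMS = {
    "ArchitecturalPlanService_GetArchitecturalPlanPagePdf": {
        "architecturalPlanId": str,
        "pageNumber": int,
    },
}


def validate_tool_call(operation_id: str, args: dict, tool_params: dict = TOOL_PARAMS) -> list[str]:
    """Return a list of problems; an empty list means the call may proceed."""
    expected = tool_params.get(operation_id)
    if expected is None:
        return [f"unknown tool: {operation_id}"]
    problems = []
    for name, expected_type in expected.items():
        if name not in args:
            problems.append(f"missing parameter: {name}")
        elif type(args[name]) is not expected_type:
            actual = type(args[name]).__name__
            problems.append(f"{name}: expected {expected_type.__name__}, got {actual}")
    problems.extend(f"unexpected parameter: {name}" for name in args if name not in expected)
    return problems
```

Returning the problems as text makes it possible to feed them back to the model so it can correct the call instead of failing the user's request.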

Risk 6: User Confusion

Impact: Medium - Low adoption
Likelihood: Medium
Mitigation:

  • Onboarding tutorial
  • Example queries visible in UI
  • Clear limitations messaging
  • Escape hatch to traditional UI

Open Questions

  1. Model Selection: Use Gemini Flash (fast, cheap) or Pro (smart, expensive)?

    • Recommendation: Start with Flash, upgrade to Pro for complex queries
  2. Session Scope: Should chat be project-scoped or global?

    • Recommendation: Project-scoped by default, with option for global queries
  3. Tool Visibility: Show users which APIs were called?

    • Recommendation: Yes, in expandable "debug" section
  4. Multi-Modal: Support image uploads (plan markups)?

    • Recommendation: Phase 2+ feature
  5. Permissions: How to handle restricted data?

    • Recommendation: The agent respects RBAC and returns "unauthorized" for restricted tools
  6. Analytics: Track all queries or sample?

    • Recommendation: Log all queries (sanitized), sample detailed telemetry
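
The "sample detailed telemetry" recommendation could use deterministic, hash-based sampling so that a session is either fully traced or not traced at all. This is a sketch under that assumption; the function name and the choice of session ID as the sampling key are hypothetical.

```python
import hashlib


def in_telemetry_sample(session_id: str, rate: float) -> bool:
    """Deterministically select roughly a fraction `rate` of sessions."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash to an integer in [0, 2**32).
    bucket = int.from_bytes(digest[:4], "big")
    return bucket < rate * 2**32
```

Because the decision depends only on the session ID, every service that handles the session reaches the same answer without coordination.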

Dependencies

Internal Dependencies

  • gRPC services running and healthy
  • gRPC-Gateway proxy operational
  • OpenAPI spec up-to-date with all operations
  • Firebase Authentication integrated
  • RBAC system functional

External Dependencies

  • Google Gemini API access
  • ADK Java libraries (latest version)
  • Angular Material 3
  • RxJS for reactive state
  • Markdown rendering library (e.g., marked.js)

Appendix

A: Agent Instruction Template

See section "Agent System Prompt" above.

B: Tool Description Guidelines

Each tool (operation) in the OpenAPI spec should have:

  • A clear summary (one-sentence description)
  • A detailed description with:
    • What the tool does
    • When to use it
    • Example use cases
    • Important parameters
    • Expected response structure

Example:

/v1/architectural-plans/{architecturalPlanId}/pages/{pageNumber}/pdf:
  post:
    operationId: ArchitecturalPlanService_GetArchitecturalPlanPagePdf
    summary: Retrieves the PDF document for a specific page
    description: |
      Gets the PDF file for a single page of an architectural plan.

      Use this when:
      - User wants to view the actual plan page
      - Need to display visual content
      - Preparing page for download

      Parameters:
      - architecturalPlanId: The plan's unique ID (e.g., "proj_oak_2024_001")
      - pageNumber: Page number (1-indexed)

      Returns:
      - Binary PDF data (base64 encoded)
      - Page metadata (dimensions, file size)

      Performance: < 500ms for standard pages
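
These guidelines could be enforced with a small lint step over the parsed spec. The sketch below assumes each operation is already loaded as a dictionary, treats the headings used in the example above ("Use this when:", "Parameters:", "Returns:") as required, and uses an arbitrary 120-character limit as a proxy for a one-sentence summary.

```python
REQUIRED_HEADINGS = ("Use this when:", "Parameters:", "Returns:")
MAX_SUMMARY_CHARS = 120  # assumed proxy for "one sentence"


def lint_operation(operation: dict) -> list[str]:
    """Return guideline violations for one parsed OpenAPI operation object."""
    issues = []
    summary = (operation.get("summary") or "").strip()
    if not summary:
        issues.append("missing summary")
    elif "\n" in summary or len(summary) > MAX_SUMMARY_CHARS:
        issues.append("summary should be a single short sentence")

    description = operation.get("description") or ""
    if not description.strip():
        issues.append("missing description")
    else:
        for heading in REQUIRED_HEADINGS:
            if heading not in description:
                issues.append(f"description lacks '{heading}' section")
    return issues
```

Running a check like this in CI would catch thin tool descriptions before they reach the agent.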

C: Example Query Patterns

Document Navigation

  • "Where is the electrical riser diagram?"
  • "Show me all pages showing the kitchen layout"
  • "Find the foundation details for the south wall"
  • "Which sheet shows the HVAC layout for the second floor?"

Compliance Analysis

  • "Are there any fire egress violations in the second floor corridor?"
  • "Does the parking lot design meet ADA requirements?"
  • "Check if the stairwell meets IBC width requirements"
  • "What code sections apply to the mechanical room?"

Code Research

  • "What does IBC 1020.2 say about corridor widths?"
  • "Explain the occupancy load calculation for assembly spaces"
  • "What are the fire rating requirements for 2-hour walls?"
  • "Search for 'guardrail height' requirements in IBC 2021"

Technical Extraction

  • "What is the building height shown on the elevations?"
  • "How many parking spaces are provided on the site plan?"
  • "What's the occupancy classification for this building?"
  • "List all the room dimensions for unit 2B"

Cross-Discipline Coordination

  • "Do the structural columns match on architectural and structural sheets?"
  • "Compare plumbing fixture locations between arch and mech plans"
  • "Are all doors on floor plans shown in the door schedule?"
  • "Check if electrical outlets match between plans and specs"

D: Error Message Patterns

API Errors

Agent: I tried to retrieve that page, but encountered an error. 
The plan service is temporarily unavailable. Please try again in a moment.

[Technical Details] (collapsible)
Error: 503 Service Unavailable
Endpoint: POST /v1/architectural-plans/proj_123/pages/5/pdf

Permission Errors

Agent: I don't have permission to access that information. 
This might be because:
- The project is restricted to certain roles
- You don't have access to this project
- The page contains sensitive information

Please contact your project administrator if you think this is an error.

Not Found Errors

Agent: I couldn't find a project called "Oakmont Plaza". 

Did you mean one of these?
- Oakmont Residential (proj_oak_2024_001)
- Oak Plaza Commercial (proj_oakplz_2024_003)

Or try: "List all my projects"
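
The "Did you mean" suggestions in the not-found pattern could come from fuzzy name matching. This sketch uses Python's standard `difflib`; the 0.4 similarity cutoff and the third project in `PROJECTS` ("Riverside Tower" with its ID) are hypothetical values added for illustration.

```python
import difflib

# Illustrative data: the first two entries come from the example above,
# the third is a hypothetical non-matching project.
PROJECTS = {
    "Oakmont Residential": "proj_oak_2024_001",
    "Oak Plaza Commercial": "proj_oakplz_2024_003",
    "Riverside Tower": "proj_riv_2024_002",
}


def suggest_projects(query: str, projects: dict[str, str],
                     limit: int = 3, cutoff: float = 0.4) -> list[tuple[str, str]]:
    """Return (name, id) pairs whose names resemble the query, best match first."""
    by_lower = {name.lower(): name for name in projects}
    matches = difflib.get_close_matches(query.lower(), list(by_lower), n=limit, cutoff=cutoff)
    return [(by_lower[m], projects[by_lower[m]]) for m in matches]
```

For the query "Oakmont Plaza", this returns the two Oak-prefixed projects and omits "Riverside Tower".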

E: Testing Strategy

Unit Tests

  • Tool selection logic
  • Parameter extraction from user queries
  • Error handling for API failures
  • Session management

Integration Tests

  • End-to-end chat flow
  • SSE streaming
  • Multi-tool workflows
  • Permission enforcement

User Acceptance Tests

  • Common query patterns
  • Mobile responsiveness
  • Accessibility (keyboard, screen reader)
  • Dark mode

Load Tests

  • 100 concurrent users
  • Long-running conversations
  • Large file handling
  • API timeout scenarios

F: Monitoring and Alerts

Metrics to Track

  • Query volume (queries/minute)
  • Response latency (P50, P95, P99)
  • Tool call success rate
  • Error rate by error type
  • Token usage (Gemini)
  • User session duration
  • Message count per session

Alerts

  • Error rate > 10% (P1 - page immediately)
  • P95 latency > 5s (P2 - investigate within 1 hour)
  • Gemini API quota at 80% (P3 - plan for scaling)
  • Session service unavailable (P1 - page immediately)
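
The alert thresholds above can be expressed as a small evaluation function. This is a sketch only: the function signature and message strings are assumptions, and the nearest-rank percentile is one of several common definitions that a monitoring system might use.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def evaluate_alerts(latencies_s: list[float], errors: int, total: int,
                    quota_used: float, quota_limit: float,
                    session_service_up: bool) -> list[tuple[str, str]]:
    """Return (priority, message) pairs for every threshold that is breached."""
    alerts = []
    if total and errors / total > 0.10:
        alerts.append(("P1", "error rate above 10%"))
    if latencies_s and percentile(latencies_s, 95) > 5.0:
        alerts.append(("P2", "P95 latency above 5s"))
    if quota_limit and quota_used / quota_limit >= 0.80:
        alerts.append(("P3", "Gemini API quota at 80% or more"))
    if not session_service_up:
        alerts.append(("P1", "session service unavailable"))
    return alerts
```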

Dashboards

  • Real-time query monitoring
  • Tool usage heatmap
  • Error breakdown by type
  • Cost tracking (Gemini API)
  • User adoption funnel

Document Version: 1.0
Last Updated: 2024-10-24
Owner: Engineering Team
Reviewers: Product, Design, Security