
TDD: Architectural Plan Page Explanation & Educational Details

πŸ“‹ Product Requirements: Plan Page Explanation PRD
πŸ“‹ Implementation Issue: Issue #258 - AI-Powered Plan Page Explanation with Agentic Workflow

Overview​

This Technical Design Document details the implementation of AI-powered page explanation generation that transforms raw LLM-extracted text and PDF images into comprehensive, professional explanation markdown. The system uses an agentic multi-turn workflow (ReAct pattern) to iteratively refine explanations through self-reflection.

Architecture Overview​

System Components​

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Frontend (Angular) β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ PageViewerComponent β”‚ β”‚
β”‚ β”‚ β”œβ”€ [Overview] [Preview] [Compliance] [Details] ← NEW β”‚ β”‚
β”‚ β”‚ └─ DetailsTabComponent ← NEW β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚ gRPC-Web β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ β–Ό Backend (Java/Spring) β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ ArchitecturalPlanService (Facade) β”‚ β”‚
β”‚ β”‚ └─ getArchitecturalPlanPage(...) ← Enhanced β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚ β”‚ β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ PageExplanation β”‚ β”‚ Agentic Workflow β”‚ β”‚
β”‚ β”‚ Service (NEW) β”‚ β”‚ Engine (NEW) β”‚ β”‚
β”‚ β”‚ β”œβ”€ generate() β”‚ β”‚ β”œβ”€ AgenticPage β”‚ β”‚
β”‚ β”‚ β”œβ”€ get() β”‚ β”‚ β”‚ Interpreter β”‚ β”‚
β”‚ β”‚ └─ regenerate() β”‚ β”‚ └─ IterativeRefiner β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚ β”‚ β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ LLM Integration Layer β”‚ β”‚
β”‚ β”‚ (Vertex AI - Gemini Models) β”‚ β”‚
β”‚ β”‚ + Prompt Caching Manager β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Cloud Storage (GCS) β”‚
β”‚ projects/{projectId}/files/{file_id}/pages/{pageNumber}/ β”‚
β”‚ β”œβ”€β”€ page.pdf ← INPUT (cached) β”‚
β”‚ β”œβ”€β”€ page.md ← INPUT β”‚
β”‚ β”œβ”€β”€ page-explanation.md ← OUTPUT (NEW) β”‚
β”‚ └── metadata.json ← UPDATED β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
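The storage layout above can be captured in a small path helper. This is an illustrative sketch only; the production code resolves these paths through ProjectPathResolver, and the class and method names here are hypothetical:

```java
// Illustrative sketch of the GCS layout shown above; the real code
// resolves these paths via ProjectPathResolver.
public class PagePaths {

    // Root folder for one plan page, following the layout:
    // projects/{projectId}/files/{file_id}/pages/{pageNumber}/
    public static String pageFolder(String projectId, String fileId, int pageNumber) {
        return String.format("projects/%s/files/%s/pages/%d", projectId, fileId, pageNumber);
    }

    // Location of the new explanation artifact written by the workflow.
    public static String explanationPath(String projectId, String fileId, int pageNumber) {
        return pageFolder(projectId, fileId, pageNumber) + "/page-explanation.md";
    }
}
```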

Data Flow: Page Explanation Generation

User/System Request
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ PageExplanationService β”‚
β”‚ .generatePageExplanation(...) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Load Page Context β”‚
β”‚ β”œβ”€ Read page.md β”‚
β”‚ β”œβ”€ Read page.pdf β”‚
β”‚ └─ Read metadata.json β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ AgenticPageInterpreter β”‚
β”‚ .explainPage(pageContext) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ β”‚
β–Ό β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ TURN 1: Generate β”‚ β”‚ Prompt Caching β”‚
β”‚ Initial Draft │◄────────────── Manager β”‚
β”‚ β”‚ β”‚ (Cache PDF) β”‚
β”‚ Input: β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚ - page.pdf (img) β”‚
β”‚ - page.md (text) β”‚
β”‚ - generation promptβ”‚
β”‚ β”‚
β”‚ Output: β”‚
β”‚ - draft-v1.md β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ TURN 2: Reflect β”‚
β”‚ on Draft Quality β”‚
β”‚ β”‚
β”‚ Input: β”‚
β”‚ - draft-v1.md β”‚
β”‚ - reflection promptβ”‚
β”‚ β”‚
β”‚ Output: β”‚
β”‚ - reflection.json β”‚
β”‚ (gaps, issues) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ TURN 3: Refine β”‚
β”‚ with Reflection β”‚
β”‚ β”‚
β”‚ Input: β”‚
β”‚ - page.pdf (cached)β”‚
β”‚ - draft-v1.md β”‚
β”‚ - reflection.json β”‚
β”‚ - refinement promptβ”‚
β”‚ β”‚
β”‚ Output: β”‚
β”‚ - page-explanation β”‚
β”‚ .md (FINAL) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Optional: Additionalβ”‚
β”‚ Iterations β”‚
β”‚ (If max_iter > 3) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Save Artifacts β”‚
β”‚ β”œβ”€ Write page-explanation.md β”‚
β”‚ β”œβ”€ Update metadata.json β”‚
β”‚ └─ Log generation metrics β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
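The turn sequence above reduces to a plain control loop. The sketch below stubs the three turns with placeholder string functions so the shape of the Generate β†’ Reflect β†’ Refine flow is visible; the real implementation replaces each stub with an LLM call:

```java
// Control-flow sketch of the Generate -> Reflect -> Refine workflow above.
// The three turn methods are stand-ins for real model calls.
public class ExplanationLoop {

    static String generate(String pageText) { return "draft:" + pageText; }

    static String reflect(String draft) { return "gaps(" + draft + ")"; }

    static String refine(String draft, String reflection) {
        return draft + "+fixed[" + reflection + "]";
    }

    // Runs one Generate turn, then Reflect/Refine pairs up to maxIterations.
    public static String run(String pageText, int maxIterations) {
        String draft = generate(pageText);          // Turn 1
        for (int i = 0; i < maxIterations; i++) {
            String reflection = reflect(draft);     // Turn 2, 4, ...
            draft = refine(draft, reflection);      // Turn 3, 5, ...
        }
        return draft;                               // page-explanation.md content
    }
}
```

With maxIterations = 1 this yields exactly the three-turn sequence diagrammed above; extra iterations append more Reflect/Refine pairs.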

Proto Message Definitions​

Import Existing Cost Analysis​

import "cost_analysis.proto";
import "google/protobuf/timestamp.proto";

New Messages for Page Interpretation​

syntax = "proto3";

package codetricks.construction.api;

import "google/protobuf/timestamp.proto";
import "cost_analysis.proto";

// ============================================================================
// PAGE INTERPRETATION - Request/Response Messages
// ============================================================================

// Request to generate AI-powered professional explanation for a plan page
message GeneratePageExplanationRequest {
  // Project identification
  string project_id = 1;
  string file_id = 2;
  int32 page_number = 3;

  // Processing options
  bool force_regenerate = 4;  // Regenerate even if already exists
  int32 max_iterations = 5;   // Maximum agentic iterations (default: 3)
  bool verbose_logging = 6;   // Log prompts and responses

  // Model configuration (optional, uses defaults if not specified)
  // Multi-model strategy enables cost optimization:
  // - Use premium models (Gemini Pro) for quality-critical tasks (Generation, Refinement)
  // - Use efficient models (Gemini Flash) for analytical tasks (Reflection, Scoring)
  // - Gemini Flash is substantially cheaper than Pro with minimal quality impact for reflection
  string primary_model = 7;    // Primary model for generation/refinement (default: "gemini-2.5-pro-latest")
  string reflection_model = 8; // Model for reflection/analysis (default: same as primary, or "gemini-2.0-flash-exp" for large cost savings)
  bool enable_caching = 9;     // Use prompt caching (default: true)

  // Advanced: Per-turn model override for experimentation
  map<string, string> turn_models = 10; // Turn type β†’ model (e.g., {"REFLECT": "gemini-flash", "GENERATE": "gemini-pro"})
}

message GeneratePageExplanationResponse {
  bool success = 1;
  string status_message = 2;

  // Metadata about generated explanation
  PageExplanationMetadata metadata = 3;

  // Performance metrics (reuses existing CostAnalysisMetadata)
  CostAnalysisMetadata cost_analysis = 4;
  int32 processing_time_seconds = 5;

  // Error details (if success = false)
  string error_code = 6;
  string error_details = 7;
}

// Metadata about page explanation generation
message PageExplanationMetadata {
  // Generation status
  string status = 1; // "pending", "processing", "completed", "failed"
  google.protobuf.Timestamp generated_at = 2;

  // Model tracking (multi-model support for cost optimization)
  // Phase 1: Single model (primary_model only)
  // Phase 2: Multi-model (different models for different turn types)
  string primary_model = 3;           // Main model used (e.g., "gemini-2.5-pro-latest")
  map<string, int32> models_used = 4; // Model β†’ turn count (e.g., {"gemini-2.5-pro": 2, "gemini-flash": 1})

  // Workflow tracking
  int32 iterations_completed = 5; // Number of complete loop cycles
  int32 total_turns = 6;          // Total LLM API calls

  // Cost analysis (reuses existing CostAnalysisMetadata)
  // Provides comprehensive token tracking:
  // - Total tokens and estimated cost
  // - Detailed breakdown (non-cached input, cached content, output)
  // - Rate per million tokens
  // - Discount percentages for cached content
  // - Processing metadata (duration, caching efficiency)
  CostAnalysisMetadata cost_analysis = 7;

  // Output file
  string file_path = 8; // Relative path: "page-explanation.md"

  // Quality metrics (optional, for evaluation)
  float quality_score = 9; // 0.0-1.0 (future: human evaluation)
}

Note: This reuses the existing CostAnalysisMetadata message from cost_analysis.proto (Issue #176) which is already used for task cost tracking. This ensures:

  • Consistency: Same cost tracking across all LLM operations
  • Richness: Comprehensive token breakdown with caching metrics
  • Integration: Works with existing Firestore task tracking
  • No Duplication: Avoids creating redundant proto messages

Multi-Model Support: The MetaCostAnalysis message (also in cost_analysis.proto) supports per-model cost breakdown for workflows that use multiple models:

message MetaCostAnalysis {
  map<string, CostAnalysisMetadata> model_costs = 1; // Per-model breakdown
  double total_cost_usd = 2; // Aggregated total
  int32 total_tokens = 3;
  string primary_model = 4;
}

This is perfect for multi-model workflows where:

  • Turn 1 (Generate): Uses a premium model (e.g., Gemini 2.5 Pro) β†’ tracked separately
  • Turn 2 (Reflect): Uses an efficient model (e.g., Gemini Flash) β†’ tracked separately
  • Turn 3 (Refine): Uses a premium model β†’ tracked separately
  • Final: Aggregated cost shows the total across all models

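The per-model aggregation can be mirrored in plain Java. This is a sketch that tracks only cost totals (field names echo the proto, but this is not the generated proto class, and MetaCost/addTurn are illustrative names):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch of the MetaCostAnalysis aggregation described above;
// the real workflow populates the generated proto messages instead.
public class MetaCost {

    // model name -> accumulated USD cost across all turns run on that model
    private final Map<String, Double> modelCosts = new LinkedHashMap<>();

    public void addTurn(String model, double costUsd) {
        modelCosts.merge(model, costUsd, Double::sum);
    }

    // Aggregated total across all models (total_cost_usd in the proto).
    public double totalCostUsd() {
        return modelCosts.values().stream().mapToDouble(Double::doubleValue).sum();
    }
}
```

Feeding in the three turns from the multi-model example below (two Pro turns at $0.019 and one Flash turn at $0.0004) aggregates to roughly $0.038 per page.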
// ============================================================================
// EXISTING MESSAGE UPDATES
// ============================================================================

// Update to existing ArchitecturalPlanPage message
message ArchitecturalPlanPage {
  // ... existing fields (pageNumber, fileId, title, summary, etc.) ...

  // NEW: Rich explanation content
  string explanation_markdown = 20; // Content from page-explanation.md
  PageExplanationMetadata explanation_metadata = 21;

  // Indicates if an explanation is available
  bool has_explanation = 22;
}

// Update to existing GetArchitecturalPlanPageRequest
message GetArchitecturalPlanPageRequest {
  string project_id = 1;
  string file_id = 2;
  int32 page_number = 3;

  // NEW: Control whether to include the explanation
  bool include_explanation = 4; // Default: true
}

New gRPC Service Methods​

service ArchitecturalPlanService {
  // ... existing methods ...

  // NEW: Generate page explanation
  rpc GeneratePageExplanation(GeneratePageExplanationRequest)
      returns (GeneratePageExplanationResponse);

  // NEW: Get page explanation status
  rpc GetPageExplanationStatus(GetPageExplanationStatusRequest)
      returns (PageExplanationMetadata);

  // NEW: Batch generate for multiple pages
  rpc BatchGeneratePageExplanation(BatchGeneratePageExplanationRequest)
      returns (stream GeneratePageExplanationResponse);
}

message GetPageExplanationStatusRequest {
  string project_id = 1;
  string file_id = 2;
  int32 page_number = 3;
}

message BatchGeneratePageExplanationRequest {
  string project_id = 1;
  string file_id = 2;
  repeated int32 page_numbers = 3; // Pages to process

  GeneratePageExplanationRequest options = 4; // Shared options
}

Multi-Model Strategy (Detailed)​

Model Selection by Task Type​

The agentic workflow performs different types of tasks, each with different requirements and optimal model choices:

| Task Type | Turn(s) | Requirements | Best Models | Cost/Turn | Quality Impact |
|---|---|---|---|---|---|
| Generation | 1 | Creative writing, comprehensive coverage, professional tone | Gemini 2.5 Pro, other premium models, GPT-4 | ~$0.02 | Critical - use premium |
| Reflection | 2, 4, 6... | Analytical assessment, gap identification, scoring | Gemini Flash, other efficient models | ~$0.0004 | Minimal - use efficient |
| Refinement | 3, 5, 7... | Creative improvement, tone consistency, gap filling | Gemini 2.5 Pro (same as Turn 1) | ~$0.02 | Critical - use premium |
| Orchestration | N/A | Quality threshold checks, iteration decisions | Gemini Flash | ~$0.0001 | None - use fastest |
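The task-type-to-model mapping above can be encoded directly. A sketch (the TurnType enum and the default model IDs are illustrative, not a fixed contract):

```java
// Sketch of the task-type -> model mapping from the table above.
// Model IDs are illustrative defaults.
public class ModelSelector {

    public enum TurnType { GENERATE, REFLECT, REFINE, ORCHESTRATE }

    public static String modelFor(TurnType turn) {
        switch (turn) {
            case GENERATE:
            case REFINE:
                return "gemini-2.5-pro";   // quality-critical creative turns
            case REFLECT:
            case ORCHESTRATE:
            default:
                return "gemini-2.5-flash"; // cheap analytical turns
        }
    }
}
```

Centralizing the choice in one helper keeps the per-turn model policy auditable and makes the turn_models override in the request proto easy to layer on top.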

Cost Optimization Examples​

Example 1: Single-Model (Phase 1 Implementation)​

PageExplanationConfig config = PageExplanationConfig.builder()
    .primaryModel("gemini-2.5-pro")
    .reflectionModel(null) // Use primary for all turns
    .maxIterations(1)
    .build();

// Iteration 1:
// - Turn 1 (Generate): Gemini Pro β†’ $0.019
// - Turn 2 (Reflect): Gemini Pro β†’ $0.003
// - Turn 3 (Refine): Gemini Pro β†’ $0.019
// Total: ~$0.04/page

Example 2: Multi-Model (Phase 2 Optimization)​

PageExplanationConfig config = PageExplanationConfig.builder()
    .primaryModel("gemini-2.5-pro")      // For Generate/Refine
    .reflectionModel("gemini-2.5-flash") // For Reflect (far cheaper)
    .maxIterations(1)
    .build();

// Iteration 1:
// - Turn 1 (Generate): Gemini Pro β†’ $0.019
// - Turn 2 (Reflect): Gemini Flash β†’ $0.0004 (~87% cheaper than the $0.003 Pro reflection)
// - Turn 3 (Refine): Gemini Pro β†’ $0.019
// Total: ~$0.038/page (savings grow with every additional reflection turn)

Example 3: Dynamic Model Selection (Advanced)​

// Use cheap model for simple pages, premium for complex ones
PageExplanationConfig config = PageExplanationConfig.builder()
.primaryModel(pageComplexity > 0.7
? "gemini-2.5-pro" // Complex: Use premium
: "gemini-2.5-flash") // Simple: Use efficient
.reflectionModel("gemini-2.5-flash") // Always cheap for reflection
.maxIterations(pageComplexity > 0.7 ? 2 : 1) // More iterations for complex pages
.build();

Model Capabilities Matrix​

| Model | Input Cost | Output Cost | Cached Cost | Strengths | Best For |
|---|---|---|---|---|---|
| Gemini 2.5 Pro ⭐ | $1.25/1M | $5.00/1M | $0.315/1M | Excellent quality, best cost/perf ratio | Generation, Refinement |
| Gemini Flash ⭐ | $0.075/1M | $0.30/1M | $0.01875/1M | Extremely fast and cheap, good analysis | Reflection, Simple pages |
| Other premium models | $3.00/1M | $15.00/1M | $0.30/1M | Superior creative writing | Complex creative tasks |
| Other efficient models | $0.25/1M | $1.25/1M | $0.03/1M | Fast, cost-effective | Reflection, Scoring |
| GPT-4 Turbo | $10.00/1M | $30.00/1M | N/A | Superior reasoning | Very complex pages only |

⭐ Recommended: Gemini 2.5 Pro + Flash combination offers the best balance of quality and cost.
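The per-turn cost figures in the tables above follow directly from the listed per-1M-token rates. A sketch of the arithmetic (the token counts below are illustrative for a typical plan page, not measured values):

```java
// Cost arithmetic behind the per-turn estimates above.
// Rates are USD per 1M tokens from the capabilities matrix.
public class TurnCost {

    public static double cost(double inRatePerM, double outRatePerM,
                              long inputTokens, long outputTokens) {
        return inRatePerM * inputTokens / 1_000_000.0
             + outRatePerM * outputTokens / 1_000_000.0;
    }
}
```

With roughly 8K input and 2K output tokens, a Gemini 2.5 Pro turn lands near the ~$0.02 figure used above, while a short Flash reflection turn comes in around $0.0004-$0.0005.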

Turn-Specific Model Recommendations​

Turn 1 (Initial Generation):

  • Recommended: Gemini 2.5 Pro ⭐
  • Why: First impression sets tone and structure, quality-critical
  • Advantages: Excellent quality, multimodal, great cost/perf ratio ($1.25/1M input)
  • Avoid: Gemini Flash for initial generation - quality matters more than speed here

Turn 2, 4, 6... (Reflection):

  • Recommended: Gemini Flash ⭐
  • Why: Analytical task, structured output (JSON), minimal creativity needed
  • Advantages: Extremely cheap ($0.075/1M vs $1.25/1M for Pro, ~94% savings on input)
  • Quality Impact: Minimal - reflection is analysis, not creative writing

Turn 3, 5, 7... (Refinement):

  • Recommended: Gemini 2.5 Pro ⭐ (same as Turn 1)
  • Why: Must maintain consistent tone, style, and quality
  • Avoid: Switching between Pro and Flash for generation tasks - causes style inconsistency

ADK (Agent Development Kit) Integration​

Phase 1 MVP: Pragmatic 3-Tool Implementation​

Goal: Ship working feature fast, design for future expansion

Philosophy: Start simple, build sophisticated

  • 3 core tools (not 11)
  • 1 agent (not 8)
  • Sequential execution (parallel later)
  • Full observability from day 1
  • Extensible architecture (add tools incrementally)

Timeline: 1 week to working feature

Overview​

This feature will be implemented using Google's Agent Development Kit (ADK) for Java to leverage proven agent orchestration patterns already established in the codebase.

Maven Dependency:

<dependency>
  <groupId>com.google.adk</groupId>
  <artifactId>google-adk</artifactId>
  <version>0.3.0</version>
</dependency>

Related:

  • ADK Java Source Reference: github/adk-java/ (downloaded for source code reference only, not used directly)
  • Maven Central: google-adk
  • Documentation: ADK Java Docs
  • Issue #257: Custom OpenAPI Toolset and ADK integration
  • Existing Usage: ArchitecturalPlanReviewAgent.java, MultiToolAgent.java

Note: We use the Maven dependency for the actual implementation. The github/adk-java/ folder is downloaded for source code reference and documentation purposes only.

Why ADK for Page Explanation?​

  1. Proven Framework: Already used successfully in ArchitecturalPlanReviewAgent for building code compliance
  2. Multi-Turn Support: Native support for iterative workflows (Generate→Reflect→Refine)
  3. Tool Integration: FunctionTool for custom methods, easy reflection/refinement orchestration
  4. State Management: Built-in session and memory management for multi-turn conversations
  5. Gemini Native: Designed for Gemini models with optimal integration
  6. Event Streaming: RxJava Flowable<Event> for reactive progress tracking
  7. Callbacks: Before/after hooks for logging, cost tracking, quality scoring

Phase 1 MVP: 3-Tool Architecture​

PageExplanationAgent (LlmAgent)
β”œβ”€ Model: gemini-2.5-pro-latest (primary)
β”œβ”€ Temperature: 0.0 (maximum consistency)
β”œβ”€ Instruction: "Orchestrate Generate β†’ Assess β†’ Refine workflow"
β”‚
β”œβ”€ Tools (Phase 1 - Core 3):
β”‚ β”œβ”€ generateExplanation() - Uses Pro, creates initial draft
β”‚ β”œβ”€ assessQuality() - Uses Flash, returns `{score, confidence, gaps}`
β”‚ └─ refineExplanation() - Uses Pro, improves draft
β”‚
β”œβ”€ Tools (Phase 1.5 - Easy Additions):
β”‚ β”œβ”€ extractKeyInsights() - Flash, understand page first
β”‚ └─ checkCompleteness() - Flash, validate coverage
β”‚ (Just add FunctionTool, no architecture change)
β”‚
β”œβ”€ Tools (Phase 2+ - Future):
β”‚ β”œβ”€ searchBuildingCodes() - Flash + RAG
β”‚ β”œβ”€ analyzeVisualElements() - Pro, multimodal
β”‚ β”œβ”€ validateTechnicalTerms() - Flash
β”‚ └─ findRelatedPages() - Flash
β”‚ (Add as needed, architecture supports it)
β”‚
β”œβ”€ Callbacks (Observability):
β”‚ β”œβ”€ afterModelCallback - TrajectoryTrackingCallback
β”‚ └─ afterToolCallback - CostTrackingCallback
β”‚
└─ Session Management: InMemorySessionService

Extensibility Pattern: Tools array is the only change needed to add features!

ADK Implementation Pattern​

Following the established pattern from ArchitecturalPlanReviewAgent:

// Similar to ArchitecturalPlanReviewAgent.java
public class PageExplanationAgent {

    // Public static for ADK Dev UI compatibility
    public static BaseAgent ROOT_AGENT = initAgent();

    public static BaseAgent initAgent() {
        return LlmAgent.builder()
            .name("page_explanation_agent")
            .model("gemini-2.5-pro-latest") // Primary model
            .generateContentConfig(
                GenerateContentConfig.builder()
                    .temperature(0.0F) // Maximum predictability and consistency
                    .build())
            .description("Generates professional explanations of architectural plan pages")
            .instruction(AGENT_INSTRUCTION)
            .tools(
                FunctionTool.create(PageExplanationAgent.class, "generateInitialDraft"),
                FunctionTool.create(PageExplanationAgent.class, "reflectOnQuality"),
                FunctionTool.create(PageExplanationAgent.class, "refineExplanation"))
            .afterModelCallback(new CostTrackingCallback())
            .build();
    }
}

Multi-Model Support with ADK​

ADK doesn't natively support per-turn model switching, but we can implement it using custom tools with embedded model calls:

public class PageExplanationTools {

    private final VertexAiClient vertexAi;

    /**
     * Tool for reflection using the efficient Gemini Flash model.
     * This bypasses the agent's primary model to use a cheaper model.
     */
    public ReflectionResult reflectOnQuality(
        @Schema(description = "The draft explanation to review") String draftMarkdown) {

        // Use Gemini Flash for cost savings (not the agent's primary Gemini Pro model)
        GenerativeModel flashModel = new GenerativeModel.Builder()
            .setModelName("gemini-2.0-flash-exp")
            .setVertexAi(vertexAi)
            .build();

        String reflectionPrompt = buildReflectionPrompt(draftMarkdown);
        GenerateContentResponse response = flashModel.generateContent(reflectionPrompt);

        // Parse reflection JSON response
        return ReflectionResult.fromJson(response.getText());
    }
}

Iterative Workflow with ADK​

The ADK agent naturally supports our Generate→Reflect→Refine workflow:

Iteration 1:

  1. Turn 1: Agent calls generateInitialDraft() tool
  2. Turn 2: Agent calls reflectOnQuality() tool (uses Flash model internally)
  3. Turn 3: Agent calls refineExplanation() tool with reflection results

Iteration 2+ (if quality < threshold):

  4. Turn 4: Agent calls reflectOnQuality() again
  5. Turn 5: Agent calls refineExplanation() again

The agent decides when to stop based on:

  • Quality score from reflection
  • Max iterations reached
  • Instruction-based stopping criteria
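The stopping decision reduces to a small predicate. A sketch (the threshold value and class/method names are illustrative):

```java
// Sketch of the agent's stopping criteria listed above: stop once the
// reflection quality score clears the threshold, or iterations run out.
public class StoppingRule {

    public static boolean shouldStop(double qualityScore, int iterationsDone,
                                     double threshold, int maxIterations) {
        return qualityScore >= threshold || iterationsDone >= maxIterations;
    }
}
```

Instruction-based stopping (the third criterion) is left to the agent's prompt; this predicate only covers the deterministic part of the decision.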

ADK Callbacks for Tracking​

public class CostTrackingCallback implements AfterModelCallbackSync {

    private final CostAnalysisBuilder costBuilder = new CostAnalysisBuilder();

    @Override
    public Maybe<Content> call(CallbackContext callbackContext) {
        // Extract token usage from Gemini response
        UsageMetadata usage = callbackContext.modelResponse().usageMetadata();

        // Track per-model costs
        costBuilder.addTurn(
            callbackContext.invocationContext().getAgent().model(),
            usage.getPromptTokenCount(),
            usage.getCandidatesTokenCount(),
            usage.getCachedContentTokenCount());

        // Log turn completion
        logger.info("Turn {}: {} tokens ({} cached)",
            costBuilder.getTurnCount(),
            usage.getTotalTokenCount(),
            usage.getCachedContentTokenCount());

        return Maybe.empty(); // Don't modify content
    }
}

Comparison: ADK vs Custom Workflow​

| Aspect | Custom Implementation | ADK Implementation |
|---|---|---|
| Agent Loop | Manual orchestration | Built-in AutoFlow |
| Tool Calling | Custom logic | Native FunctionTool |
| Model Switching | Direct API calls | Tools with embedded models |
| State Management | Manual tracking | SessionService + MemoryService |
| Event Streaming | Custom events | RxJava Flowable<Event> |
| Cost Tracking | Custom | Callbacks + UsageMetadata |
| Testing | Custom harness | ADK Dev UI |
| Debugging | Logs only | Dev UI + Event traces |

Maven Configuration​

Add to pom.xml:

<dependencies>
  <!-- ADK Core - For agent orchestration -->
  <dependency>
    <groupId>com.google.adk</groupId>
    <artifactId>google-adk</artifactId>
    <version>0.3.0</version>
  </dependency>

  <!-- ADK Dev UI - For local testing (optional) -->
  <dependency>
    <groupId>com.google.adk</groupId>
    <artifactId>google-adk-dev</artifactId>
    <version>0.3.0</version>
    <scope>provided</scope>
  </dependency>

  <!-- Vertex AI SDK - For Gemini models -->
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-aiplatform</artifactId>
    <version>3.x.x</version>
  </dependency>
</dependencies>

Build Command:

export JAVA_HOME=/usr/lib/jvm/temurin-23-jdk-arm64
mvn clean install

ADK Best Practices (from existing code)​

Based on ArchitecturalPlanReviewAgent and Issue #257:

  1. Use FunctionTool for Custom Logic:

    FunctionTool.create(PageExplanationAgent.class, "generateInitialDraft")
  2. Leverage OpenAPI for External Services (if needed):

    OpenApiToolset toolset = OpenApiToolset.builder()
        .addOpenApiSpecFromFile("openapi.yaml")
        .baseUrl("http://localhost:8082")
        .build();
  3. Use GenerateContentConfig for Model Settings:

    .generateContentConfig(
        GenerateContentConfig.builder()
            .temperature(0.0F) // Maximum predictability
            .build())
  4. Expose ROOT_AGENT for Dev UI:

    public static BaseAgent ROOT_AGENT = initAgent();
  5. Use InMemoryRunner for Execution:

    InMemoryRunner runner = new InMemoryRunner(ROOT_AGENT);
  6. Handle Events with RxJava:

    Flowable<Event> events = runner.runAsync(userId, sessionId, userMsg);
    events.blockingForEach(event -> processEvent(event));

Implementation Details​

Backend Implementation​

1. PageExplanationService​

File: src/main/java/org/codetricks/construction/code/assistant/understanding/PageExplanationService.java

package org.codetricks.construction.code.assistant.understanding;

import com.google.protobuf.Timestamp;
import org.codetricks.construction.code.assistant.FileSystemHandler;
import org.codetricks.construction.code.assistant.proto.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Service;

import java.time.Instant;
import java.util.Optional;

/**
 * Service for generating and managing AI-powered page explanations.
 *
 * This service orchestrates the agentic workflow that transforms raw plan page
 * content (OCR text + PDF) into comprehensive, educational markdown.
 *
 * Related:
 * - PRD: docs/04-prd/plan-page-explanation.md
 * - TDD: docs/05-tdd/plan-page-explanation.md
 */
@Service
public class PageExplanationService {

    private static final Logger logger = LoggerFactory.getLogger(PageExplanationService.class);

    private final FileSystemHandler fileSystemHandler;
    private final ProjectPathResolver pathResolver;
    private final AgenticPageInterpreter agenticInterpreter;
    private final PromptCachingManager cachingManager;

    public PageExplanationService(
        FileSystemHandler fileSystemHandler,
        ProjectPathResolver pathResolver,
        AgenticPageInterpreter agenticInterpreter,
        PromptCachingManager cachingManager) {
        this.fileSystemHandler = fileSystemHandler;
        this.pathResolver = pathResolver;
        this.agenticInterpreter = agenticInterpreter;
        this.cachingManager = cachingManager;
    }

    /**
     * Generate page explanation with agentic workflow.
     */
    public GeneratePageExplanationResponse generatePageExplanation(
        GeneratePageExplanationRequest request) {

        String projectId = request.getProjectId();
        String fileId = request.getFileId();
        int pageNumber = request.getPageNumber();

        logger.info("Starting page explanation generation: project={}, file={}, page={}",
            projectId, fileId, pageNumber);

        try {
            // 1. Check if explanation already exists (unless force regenerate)
            if (!request.getForceRegenerate() && explanationExists(projectId, fileId, pageNumber)) {
                logger.info("Page explanation already exists, skipping generation");
                return GeneratePageExplanationResponse.newBuilder()
                    .setSuccess(true)
                    .setStatusMessage("Page explanation already exists")
                    .setMetadata(loadExistingMetadata(projectId, fileId, pageNumber))
                    .build();
            }

            // 2. Load page context (page.md, page.pdf, metadata.json)
            PageContext pageContext = loadPageContext(projectId, fileId, pageNumber);

            // 3. Update metadata to "processing" status
            updateMetadataStatus(projectId, fileId, pageNumber, "processing");

            // 4. Run agentic workflow
            long startTime = System.currentTimeMillis();

            AgenticInterpretationResult result = agenticInterpreter.explainPage(
                pageContext,
                request.getMaxIterations() > 0 ? request.getMaxIterations() : 3,
                request.getPrimaryModel().isEmpty() ? null : request.getPrimaryModel(),
                request.getEnableCaching());

            long processingTimeMs = System.currentTimeMillis() - startTime;

            // 5. Save page-explanation.md
            String pageFolderPath = pathResolver.resolvePageFolderPath(projectId, pageNumber, fileId);
            String explanationPath = pageFolderPath + "/page-explanation.md";
            fileSystemHandler.writeFile(explanationPath, result.getFinalMarkdown());

            // 6. Update metadata with results
            PageExplanationMetadata metadata = buildMetadata(
                result,
                request.getPrimaryModel().isEmpty() ? "gemini-2.5-pro-latest" : request.getPrimaryModel(),
                (int) (processingTimeMs / 1000));
            saveMetadata(projectId, fileId, pageNumber, metadata);

            logger.info("Page explanation generated successfully: tokens={}, time={}s",
                result.getTotalTokensUsed(), processingTimeMs / 1000);

            return GeneratePageExplanationResponse.newBuilder()
                .setSuccess(true)
                .setStatusMessage("Page explanation generated successfully")
                .setMetadata(metadata)
                .setProcessingTimeSeconds((int) (processingTimeMs / 1000))
                .build();

        } catch (Exception e) {
            logger.error("Failed to generate page explanation", e);

            // Update metadata to "failed" status
            updateMetadataStatus(projectId, fileId, pageNumber, "failed");

            return GeneratePageExplanationResponse.newBuilder()
                .setSuccess(false)
                .setStatusMessage("Failed to generate page explanation")
                .setErrorCode("GENERATION_FAILED")
                .setErrorDetails(e.getMessage())
                .build();
        }
    }

    /**
     * Get page explanation content (for Details tab).
     */
    public Optional<String> getPageExplanation(String projectId, String fileId, int pageNumber)
        throws PageNotFoundException {
        String pageFolderPath = pathResolver.resolvePageFolderPath(projectId, pageNumber, fileId);
        String explanationPath = pageFolderPath + "/page-explanation.md";

        if (fileSystemHandler.exists(explanationPath)) {
            return Optional.of(fileSystemHandler.readFile(explanationPath));
        }
        return Optional.empty();
    }

    /**
     * Check if page explanation exists.
     */
    private boolean explanationExists(String projectId, String fileId, int pageNumber)
        throws PageNotFoundException {
        String pageFolderPath = pathResolver.resolvePageFolderPath(projectId, pageNumber, fileId);
        String explanationPath = pageFolderPath + "/page-explanation.md";
        return fileSystemHandler.exists(explanationPath);
    }

    /**
     * Load page context (inputs for agentic workflow).
     *
     * Uses ProjectPathResolver for consistent path resolution.
     */
    private PageContext loadPageContext(String projectId, String fileId, int pageNumber)
        throws PageNotFoundException {
        // Load page.md using ProjectPathResolver
        String pageMarkdownPath = pathResolver.getPageMarkdownPath(projectId, pageNumber, fileId);
        String pageMarkdown = fileSystemHandler.readFile(pageMarkdownPath);

        // Load page.pdf using ProjectPathResolver
        String pagePdfPath = pathResolver.getPagePdfPath(projectId, pageNumber, fileId);
        byte[] pagePdfBytes = fileSystemHandler.readFileBytes(pagePdfPath);

        // Load metadata.json using ProjectPathResolver
        String metadataPath = pathResolver.getPageMetadataPath(projectId, pageNumber, fileId);
        String metadataJson = fileSystemHandler.exists(metadataPath)
            ? fileSystemHandler.readFile(metadataPath)
            : "{}";

        return PageContext.builder()
            .projectId(projectId)
            .fileId(fileId)
            .pageNumber(pageNumber)
            .pageMarkdown(pageMarkdown)
            .pagePdfBytes(pagePdfBytes)
            .metadataJson(metadataJson)
            .build();
    }

/**
* Build metadata from agentic result.
*/
private PageExplanationMetadata buildMetadata(
AgenticInterpretationResult result,
String modelName,
int processingTimeSeconds) {

return PageExplanationMetadata.newBuilder()
.setStatus("completed")
.setGeneratedAt(Timestamp.newBuilder()
.setSeconds(Instant.now().getEpochSecond())
.build())
.setModel(modelName)
.setIterations(result.getIterationCount())
.setTokensUsed(TokenUsage.newBuilder()
.setInput(result.getTotalInputTokens())
.setOutput(result.getTotalOutputTokens())
.setCached(result.getTotalCachedTokens())
.build())
.setFilePath("page-explanation.md")
.build();
}

/**
* Save metadata to metadata.json.
*/
private void saveMetadata(String projectId, String fileId, int pageNumber,
PageExplanationMetadata metadata) throws PageNotFoundException {
String metadataPath = pathResolver.getPageMetadataPath(projectId, pageNumber, fileId);

// Read existing metadata, update understanding section, write back
// (Implementation uses JSON merging logic - omitted for brevity)

logger.info("Saved page explanation metadata: {}", metadataPath);
}

/**
* Update metadata status only.
*/
private void updateMetadataStatus(String projectId, String fileId, int pageNumber, String status) {
// Similar to saveMetadata but only updates status field
logger.info("Updated page explanation status to: {}", status);
}

/**
* Load existing metadata (if already generated).
*/
private PageExplanationMetadata loadExistingMetadata(String projectId, String fileId, int pageNumber) {
// Load from metadata.json and parse understanding section
// (Implementation omitted for brevity)
return PageExplanationMetadata.newBuilder()
.setStatus("completed")
.build();
}
}
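The JSON merging logic elided in saveMetadata amounts to replacing one section of metadata.json while preserving the rest. Here is a minimal sketch of that merge using plain Maps; the real implementation presumably parses the file with a JSON library first, and mergeUnderstanding is a hypothetical helper, not part of the service:

```java
import java.util.HashMap;
import java.util.Map;

public class MetadataMerge {

    // Replace only the "understanding" section; every other key survives untouched.
    public static Map<String, Object> mergeUnderstanding(
            Map<String, Object> existing, Map<String, Object> understanding) {
        Map<String, Object> merged = new HashMap<>(existing);
        merged.put("understanding", understanding);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> existing = new HashMap<>();
        existing.put("page_number", 3);

        Map<String, Object> understanding = new HashMap<>();
        understanding.put("status", "completed");
        understanding.put("file_path", "page-explanation.md");

        Map<String, Object> merged = mergeUnderstanding(existing, understanding);
        System.out.println(merged.containsKey("page_number")); // existing sections preserved
    }
}
```

The same shape works for updateMetadataStatus, which would merge only a single status field into the understanding section.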

2. AgenticPageInterpreter (Agentic Workflow Engine)​

File: src/main/java/org/codetricks/construction/code/assistant/understanding/AgenticPageInterpreter.java

package org.codetricks.construction.code.assistant.understanding;

import org.codetricks.construction.code.assistant.llm.LLMClient;
import org.codetricks.construction.code.assistant.llm.PromptCachingManager;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
* Agentic workflow engine for iterative page explanation generation.
*
* Implements a multi-turn ReAct-inspired loop:
* 1. Generate initial understanding draft
* 2. Reflect on draft quality (identify gaps)
* 3. Refine draft with reflection insights
*
* Uses prompt caching for PDF images across turns to reduce costs.
*/
@Component
public class AgenticPageInterpreter {

private static final Logger logger = LoggerFactory.getLogger(AgenticPageInterpreter.class);

private final LLMClient llmClient;
private final PromptCachingManager cachingManager;
private final PromptTemplateLoader promptLoader;

public AgenticPageInterpreter(
LLMClient llmClient,
PromptCachingManager cachingManager,
PromptTemplateLoader promptLoader) {
this.llmClient = llmClient;
this.cachingManager = cachingManager;
this.promptLoader = promptLoader;
}

/**
* Interpret architectural plan page with multi-turn agentic workflow.
*
* @param pageContext Input context (page.md, page.pdf, metadata)
* @param maxIterations Maximum number of turns (default: 3)
* @param modelName LLM model used for generation, reflection, and refinement
* @param enableCaching Use prompt caching for PDF (default: true)
* @return Final understanding markdown and metrics
*/
public AgenticInterpretationResult explainPage(
PageContext pageContext,
int maxIterations,
String modelName,
boolean enableCaching) {

logger.info("Starting agentic explanation: project={}, file={}, page={}, maxIter={}",
pageContext.getProjectId(), pageContext.getFileId(),
pageContext.getPageNumber(), maxIterations);

// Initialize result tracking
AgenticInterpretationResult.Builder resultBuilder = AgenticInterpretationResult.builder();
List<CostAnalysisMetadata> turnCostsList = new ArrayList<>();

String currentDraft = null;
String reflectionNotes = null;

try {
// TURN 1: Generate initial understanding
logger.info("TURN 1: Generating initial explanation draft");
TurnResult turn1 = generateInitialDraft(pageContext, modelName, enableCaching);
currentDraft = turn1.getOutput();
turnCostsList.add(turn1.getCostAnalysis());

logger.info("TURN 1 completed: {} tokens ({} cached), cost: ${}",
turn1.getCostAnalysis().getTotalTokens(),
turn1.getCostAnalysis().getTokenBreakdown().getCachedContent().getTokenCount(),
turn1.getCostAnalysis().getEstimatedTotalCostUsd());

// If max iterations is 1, skip reflection and refinement
if (maxIterations <= 1) {
logger.info("Max iterations = 1, returning initial draft");
return buildFinalResult(currentDraft, turnCostsList);
}

// TURN 2: Reflect on draft quality
logger.info("TURN 2: Reflecting on draft quality");
TurnResult turn2 = reflectOnDraft(currentDraft, modelName);
reflectionNotes = turn2.getOutput();
turnCostsList.add(turn2.getCostAnalysis());

logger.info("TURN 2 completed: identified improvement areas, cost: ${}",
turn2.getCostAnalysis().getEstimatedTotalCostUsd());

// If max iterations is 2, return draft with reflection logged
if (maxIterations <= 2) {
logger.info("Max iterations = 2, returning draft after reflection");
return buildFinalResult(currentDraft, turnCostsList);
}

// TURN 3: Refine with reflection insights
logger.info("TURN 3: Refining draft with reflection insights");
TurnResult turn3 = refineWithReflection(
pageContext, currentDraft, reflectionNotes, modelName, enableCaching);
currentDraft = turn3.getOutput();
turnCostsList.add(turn3.getCostAnalysis());

logger.info("TURN 3 completed: final explanation generated, cost: ${}",
turn3.getCostAnalysis().getEstimatedTotalCostUsd());

// Additional iterations (if maxIterations > 3)
for (int i = 4; i <= maxIterations; i++) {
logger.info("TURN {}: Additional refinement iteration", i);

// Reflect again
TurnResult reflectAgain = reflectOnDraft(currentDraft, modelName);
reflectionNotes = reflectAgain.getOutput();
turnCostsList.add(reflectAgain.getCostAnalysis());

// Refine again
TurnResult refineAgain = refineWithReflection(
pageContext, currentDraft, reflectionNotes, modelName, enableCaching);
currentDraft = refineAgain.getOutput();
turnCostsList.add(refineAgain.getCostAnalysis());

logger.info("TURN {} completed", i);
}

return buildFinalResult(currentDraft, turnCostsList);

} catch (Exception e) {
logger.error("Agentic explanation failed", e);
throw new RuntimeException("Failed to interpret page", e);
}
}

/**
* TURN 1: Generate initial understanding draft.
*/
private TurnResult generateInitialDraft(
PageContext pageContext,
String modelName,
boolean enableCaching) {

// Load prompt template
String promptTemplate = promptLoader.loadTemplate("page-understanding-generate.txt");

// Build prompt with page context
String prompt = promptTemplate
.replace("{{PAGE_MARKDOWN}}", pageContext.getPageMarkdown())
.replace("{{PROJECT_ID}}", pageContext.getProjectId())
.replace("{{FILE_ID}}", pageContext.getFileId())
.replace("{{PAGE_NUMBER}}", String.valueOf(pageContext.getPageNumber()));

// Prepare LLM request with PDF image
// Uses PRIMARY model for quality-critical generation
LLMRequest request = LLMRequest.builder()
.modelName(modelName)
.systemPrompt(promptLoader.loadTemplate("page-understanding-system.txt"))
.userPrompt(prompt)
.imageBytes(pageContext.getPagePdfBytes())
.imageMediaType("application/pdf")
.enableCaching(enableCaching)
.cacheImageMarker(true) // Mark PDF for caching
.maxTokens(4000)
.temperature(0.0) // Maximum predictability and consistency
.build();

// Call LLM
LLMResponse response = llmClient.generate(request);

// Return result with CostAnalysisMetadata
return TurnResult.builder()
.turnNumber(1)
.output(response.getContent())
.costAnalysis(response.getCostAnalysisMetadata()) // From LLM response
.build();
}

/**
* TURN 2: Reflect on draft quality.
*
* MODEL SELECTION: Uses REFLECTION model (cost-optimizable)
* - This turn performs analytical quality assessment
* - Requires: Structured analysis, gap identification, scoring
* - Best models: efficient-tier models such as Gemini Flash
* - Cost: ~$0.002/turn (10-20x cheaper than premium)
*
* OPTIMIZATION OPPORTUNITY:
* Using Gemini Flash instead of Pro for reflection saves ~50% on total cost
* with minimal quality impact (reflection is analytical, not creative).
*/
private TurnResult reflectOnDraft(String draft, String reflectionModel) {

// Load reflection prompt
String promptTemplate = promptLoader.loadTemplate("page-understanding-reflect.txt");
String prompt = promptTemplate.replace("{{DRAFT_MARKDOWN}}", draft);

// Prepare LLM request (no image needed for reflection)
// Uses REFLECTION model (can be cheaper than primary for cost savings)
LLMRequest request = LLMRequest.builder()
.modelName(reflectionModel)
.systemPrompt(promptLoader.loadTemplate("page-understanding-reflect-system.txt"))
.userPrompt(prompt)
.maxTokens(2000)
.temperature(0.0) // Consistent temperature for predictability
.build();

// Call LLM
LLMResponse response = llmClient.generate(request);

// Return result with CostAnalysisMetadata
return TurnResult.builder()
.turnNumber(2)
.output(response.getContent())
.costAnalysis(response.getCostAnalysisMetadata()) // From LLM response
.build();
}

/**
* TURN 3+: Refine draft with reflection insights.
*
* MODEL SELECTION: Uses PRIMARY model (quality-critical)
* - This turn improves explanation based on reflection feedback
* - Requires: Creative refinement, maintaining tone, addressing gaps
* - Best models: premium-tier models such as GPT-4
* - Cost: ~$0.05/turn (expensive but essential for quality)
*
* Note: Must use same model as Turn 1 (Generate) to maintain consistent
* writing style, tone, and quality throughout the explanation.
*/
private TurnResult refineWithReflection(
PageContext pageContext,
String draft,
String reflectionNotes,
String primaryModel,
boolean enableCaching) {

// Load refinement prompt
String promptTemplate = promptLoader.loadTemplate("page-understanding-refine.txt");
String prompt = promptTemplate
.replace("{{DRAFT_MARKDOWN}}", draft)
.replace("{{REFLECTION_NOTES}}", reflectionNotes)
.replace("{{PAGE_MARKDOWN}}", pageContext.getPageMarkdown());

// Prepare LLM request with PDF image (reuse cache)
// Uses PRIMARY model for quality-critical refinement
LLMRequest request = LLMRequest.builder()
.modelName(primaryModel)
.systemPrompt(promptLoader.loadTemplate("page-understanding-system.txt"))
.userPrompt(prompt)
.imageBytes(pageContext.getPagePdfBytes())
.imageMediaType("application/pdf")
.enableCaching(enableCaching)
.cacheImageMarker(true) // Reuse cached PDF
.maxTokens(5000)
.temperature(0.0) // Maximum predictability
.build();

// Call LLM
LLMResponse response = llmClient.generate(request);

// Return result with CostAnalysisMetadata
return TurnResult.builder()
.turnNumber(3)
.output(response.getContent())
.costAnalysis(response.getCostAnalysisMetadata()) // From LLM response
.build();
}

/**
* Build final result from turn cost analyses.
*
* Aggregates CostAnalysisMetadata from all turns using MetaCostAnalysis
* for proper per-model cost tracking.
*/
private AgenticInterpretationResult buildFinalResult(
String finalMarkdown,
List<CostAnalysisMetadata> turnCostsList) {

// Use MetaCostAnalysis to aggregate per-model costs
MetaCostAnalysis.Builder metaBuilder = MetaCostAnalysis.newBuilder();

double totalCost = 0.0;
int totalTokens = 0;

// Aggregate by model
Map<String, CostAnalysisMetadata.Builder> modelCosts = new HashMap<>();

for (CostAnalysisMetadata turnCost : turnCostsList) {
String model = turnCost.getModel();
totalCost += turnCost.getEstimatedTotalCostUsd();
totalTokens += turnCost.getTotalTokens();

// Merge into per-model aggregation
// (Implementation omitted for brevity - would merge token breakdowns)
}

MetaCostAnalysis metaCost = metaBuilder
.setTotalCostUsd(totalCost)
.setTotalTokens(totalTokens)
.setPrimaryModel(turnCostsList.get(0).getModel())
.build();

return AgenticInterpretationResult.builder()
.finalMarkdown(finalMarkdown)
.iterationCount(turnCostsList.size())
.metaCostAnalysis(metaCost)
.turnCosts(turnCostsList)
.build();
}
}
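The per-model merge omitted in buildFinalResult can be sketched as a single grouping pass over the turn costs. TurnCost and ModelTotals below are simplified stand-ins for the CostAnalysisMetadata proto, used only to illustrate the aggregation:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PerModelAggregation {

    // Simplified stand-in for CostAnalysisMetadata.
    public record TurnCost(String model, double costUsd, long tokens) {}

    // Accumulated totals for one model.
    public record ModelTotals(double costUsd, long tokens) {}

    public static Map<String, ModelTotals> aggregate(List<TurnCost> turns) {
        Map<String, ModelTotals> byModel = new LinkedHashMap<>();
        for (TurnCost t : turns) {
            // Sum cost and token totals per model name.
            byModel.merge(t.model(),
                new ModelTotals(t.costUsd(), t.tokens()),
                (a, b) -> new ModelTotals(a.costUsd() + b.costUsd(), a.tokens() + b.tokens()));
        }
        return byModel;
    }

    public static void main(String[] args) {
        List<TurnCost> turns = List.of(
            new TurnCost("primary", 0.05, 12_000),     // Turn 1: generate
            new TurnCost("reflection", 0.002, 3_000),  // Turn 2: reflect
            new TurnCost("primary", 0.06, 14_000));    // Turn 3: refine

        Map<String, ModelTotals> byModel = aggregate(turns);
        System.out.printf("primary: $%.3f over %d tokens%n",
            byModel.get("primary").costUsd(), byModel.get("primary").tokens());
    }
}
```

Splitting totals by model this way is what makes the primary/reflection cost optimization measurable in MetaCostAnalysis.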

3. Prompt Templates​

File: src/main/resources/prompts/page-understanding-generate.txt

You are an expert architectural plan interpreter. Your task is to generate a comprehensive, 
educational explanation of an architectural plan page that makes it accessible to beginners
and non-industry experts.

# Input Context

**Project ID:** {{PROJECT_ID}}
**File ID:** {{FILE_ID}}
**Page Number:** {{PAGE_NUMBER}}

## Raw OCR Text

{{PAGE_MARKDOWN}}

## PDF Image

[Attached: Full-resolution PDF page image]

# Your Task

Generate a rich, educational markdown document that explains this plan page comprehensively.
Your explanation should:

1. **Be Beginner-Friendly**: Use simple language, define technical terms inline, and explain
architectural concepts as if teaching someone new to the field.

2. **Be Comprehensive**: Cover all major elements visible on the page - drawings, tables,
legends, annotations, title blocks, etc.

3. **Explain Relationships**: Show how elements connect (e.g., how zoning requirements affect
building setbacks, how legends map to drawing symbols).

4. **Provide Context**: Explain what each section means in the broader context of construction
and building codes.

5. **Use Rich Markdown**: Structure with headings, lists, tables, blockquotes for definitions,
and emphasis where helpful.

6. **Include Visual Descriptions**: Describe what the drawings show, not just the text.

# Output Format

Generate markdown with the following structure:

```markdown
# [Page Title/Name]

## Overview
Brief introduction to what this page contains and its purpose.

## Key Information

### [Section 1]
Detailed explanation of the first major section...

**Technical Term**: Definition inline for beginners.

### [Section 2]
...

## Understanding the Drawings

Describe visual elements, symbols, and what they represent...

## Architectural Concepts Explained

Explain any complex concepts for beginners...

## Code Compliance Considerations

If relevant, explain how this relates to building codes...

## Summary

Recap the most important takeaways from this page.

```

# Guidelines

- Assume the reader has NO architectural background
- Define ALL technical terms when first used
- Use analogies when explaining complex concepts
- Be thorough but not overwhelming
- Focus on understanding, not just description
- Make it educational and engaging

Generate the educational markdown now:


**File**: `src/main/resources/prompts/page-understanding-reflect.txt`

```text
You are a quality reviewer for educational architectural content. Your task is to review
the following page explanation draft and identify areas for improvement.

# Draft to Review

{{DRAFT_MARKDOWN}}

# Your Task

Analyze this draft and identify:

1. **Gaps in Coverage**: What important elements from the page are missing or under-explained?

2. **Clarity Issues**: Where is the language unclear, too technical, or confusing for beginners?

3. **Missing Context**: Where could relationships between elements be explained better?

4. **Definition Gaps**: Are there technical terms that need inline definitions?

5. **Structure Issues**: Could the organization be improved for better readability?

6. **Educational Value**: Where could the content be more engaging or educational?

# Output Format

Provide your reflection as structured JSON:

```json
{
"gaps": [
"Missing explanation of X",
"Section Y needs more detail on Z"
],
"clarity_issues": [
"Term 'ABC' is not defined",
"Paragraph about DEF is too technical"
],
"missing_context": [
"Relationship between X and Y not explained",
"How Z affects building design unclear"
],
"structure_suggestions": [
"Consider adding a subsection for X",
"Reorder sections Y and Z for better flow"
],
"overall_assessment": "Brief summary of draft quality and main improvement areas"
}

```

Generate your reflection now:
```

**File**: `src/main/resources/prompts/page-understanding-refine.txt`

```text
You are an expert architectural plan interpreter refining an educational explanation based
on quality feedback.

# Original Draft

{{DRAFT_MARKDOWN}}

# Reflection and Improvement Areas

{{REFLECTION_NOTES}}

# Original Raw Content (for reference)

{{PAGE_MARKDOWN}}

## PDF Image (for reference)

[Attached: Full-resolution PDF page image]

# Your Task

Improve the draft by addressing the identified issues:

1. Fill gaps in coverage
2. Clarify unclear sections
3. Add missing context and relationships
4. Define missing technical terms inline
5. Improve structure if needed
6. Enhance educational value

# Guidelines

- Keep what works well in the original draft
- Focus improvements on the identified issues
- Maintain beginner-friendly language
- Ensure comprehensive coverage
- Make it engaging and educational

# Output Format

Generate the improved markdown (full document, not just changes):

```markdown
[Your improved, comprehensive page explanation here]

```

Generate the refined understanding now:
```


### Frontend Implementation

#### 1. Update PageViewerComponent

**File**: `web-ng-m3/src/app/components/page-viewer/page-viewer.component.ts`

```typescript
import { Component, OnInit, Input } from '@angular/core';
import { ArchitecturalPlanService } from '../../shared/architectural-plan.service';
import { ArchitecturalPlanPage, PageExplanationMetadata } from '../../shared/proto/api';

@Component({
selector: 'app-page-viewer',
templateUrl: './page-viewer.component.html',
styleUrls: ['./page-viewer.component.scss']
})
export class PageViewerComponent implements OnInit {
@Input() projectId!: string;
@Input() fileId!: string;
@Input() pageNumber!: number;

// Tab state
selectedTabIndex = 0; // 0: Overview, 1: Preview, 2: Compliance, 3: Details (NEW)

// Page data
page?: ArchitecturalPlanPage;

// Details tab state (NEW)
understandingMarkdown?: string;
understandingMetadata?: PageExplanationMetadata;
understandingLoading = false;
understandingError?: string;

constructor(private planService: ArchitecturalPlanService) {}

ngOnInit(): void {
this.loadPage();
}

/**
* Load page data (including understanding if available).
*/
loadPage(): void {
this.planService.getArchitecturalPlanPage(
this.projectId,
this.fileId,
this.pageNumber,
true // include_understanding = true
).subscribe({
next: (page) => {
this.page = page;

// Check if understanding is available
if (page.hasUnderstanding && page.understandingMarkdown) {
this.understandingMarkdown = page.understandingMarkdown;
this.understandingMetadata = page.understandingMetadata;
}
},
error: (err) => {
console.error('Failed to load page', err);
}
});
}

/**
* Handle Details tab selection (lazy load if needed).
*/
onTabChange(tabIndex: number): void {
this.selectedTabIndex = tabIndex;

// If Details tab (index 3) and understanding not loaded yet
if (tabIndex === 3 && !this.understandingMarkdown && !this.understandingError) {
this.loadUnderstanding();
}
}

/**
* Load page explanation (triggered by Details tab selection).
*/
loadUnderstanding(): void {
// If metadata says it's processing, show loading state
if (this.understandingMetadata?.status === 'processing') {
this.understandingLoading = true;
// Poll for completion (or use WebSocket for real-time updates)
this.pollUnderstandingStatus();
return;
}

// If metadata says it's failed, show error
if (this.understandingMetadata?.status === 'failed') {
this.understandingError = 'Failed to generate page explanation';
return;
}

// Otherwise, trigger generation if not exists
if (!this.page?.hasUnderstanding) {
this.generateUnderstanding();
}
}

/**
* Trigger page explanation generation.
*/
generateUnderstanding(): void {
this.understandingLoading = true;
this.understandingError = undefined;

this.planService.generatePageExplanation(
this.projectId,
this.fileId,
this.pageNumber
).subscribe({
next: (response) => {
if (response.success) {
// Reload page to get understanding content
this.loadPage();
} else {
this.understandingError = response.statusMessage;
}
this.understandingLoading = false;
},
error: (err) => {
console.error('Failed to generate understanding', err);
this.understandingError = 'Failed to generate page explanation';
this.understandingLoading = false;
}
});
}

/**
* Poll for understanding generation completion.
*/
pollUnderstandingStatus(): void {
const pollInterval = setInterval(() => {
this.planService.getPageExplanationStatus(
this.projectId,
this.fileId,
this.pageNumber
).subscribe({
next: (metadata) => {
if (metadata.status === 'completed') {
clearInterval(pollInterval);
this.loadPage(); // Reload to get content
this.understandingLoading = false;
} else if (metadata.status === 'failed') {
clearInterval(pollInterval);
this.understandingError = 'Failed to generate page explanation';
this.understandingLoading = false;
}
},
error: (err) => {
console.error('Failed to poll status', err);
clearInterval(pollInterval);
this.understandingError = 'Failed to check generation status';
this.understandingLoading = false;
}
});
}, 5000); // Poll every 5 seconds
}
}

File: web-ng-m3/src/app/components/page-viewer/page-viewer.component.html

<mat-card class="page-viewer-card">
<!-- Tab Group with NEW Details tab -->
<mat-tab-group [(selectedIndex)]="selectedTabIndex" (selectedTabChange)="onTabChange($event.index)">

<!-- Overview Tab (existing) -->
<mat-tab label="Overview">
<div class="tab-content">
<app-page-overview [page]="page"></app-page-overview>
</div>
</mat-tab>

<!-- Preview Tab (existing) -->
<mat-tab label="Preview">
<div class="tab-content">
<app-page-preview [page]="page"></app-page-preview>
</div>
</mat-tab>

<!-- Compliance Tab (existing) -->
<mat-tab label="Compliance">
<div class="tab-content">
<app-page-compliance [page]="page"></app-page-compliance>
</div>
</mat-tab>

<!-- Details Tab (NEW) -->
<mat-tab label="Details">
<div class="tab-content details-tab">

<!-- Loading State -->
<div *ngIf="understandingLoading" class="loading-state">
<mat-spinner diameter="40"></mat-spinner>
<p>Generating detailed page explanation...</p>
<p class="loading-hint">This may take 1-2 minutes. AI is analyzing the page content.</p>
</div>

<!-- Error State -->
<div *ngIf="understandingError && !understandingLoading" class="error-state">
<mat-icon color="warn">error</mat-icon>
<p>{{ understandingError }}</p>
<button mat-raised-button color="primary" (click)="generateUnderstanding()">
<mat-icon>refresh</mat-icon> Retry
</button>
</div>

<!-- Content State -->
<div *ngIf="understandingMarkdown && !understandingLoading" class="understanding-content">
<!-- Metadata Banner -->
<div class="metadata-banner">
<mat-icon>auto_awesome</mat-icon>
<span>AI-generated explanation</span>
<span class="metadata-details">
Generated {{ understandingMetadata?.generatedAt | date:'short' }} |
{{ understandingMetadata?.iterations }} iterations
</span>
</div>

<!-- Markdown Content -->
<markdown [data]="understandingMarkdown" class="markdown-content"></markdown>
</div>

<!-- Empty State (no understanding available, not generating) -->
<div *ngIf="!understandingMarkdown && !understandingLoading && !understandingError" class="empty-state">
<mat-icon>description</mat-icon>
<h3>Details Not Yet Generated</h3>
<p>AI-powered page explanation has not been generated for this page yet.</p>
<button mat-raised-button color="primary" (click)="generateUnderstanding()">
<mat-icon>auto_awesome</mat-icon> Generate Details
</button>
</div>

</div>
</mat-tab>

</mat-tab-group>
</mat-card>

File: web-ng-m3/src/app/components/page-viewer/page-viewer.component.scss

.page-viewer-card {
margin: 16px;
}

.tab-content {
padding: 24px;
min-height: 400px;
}

.details-tab {
.loading-state,
.error-state,
.empty-state {
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
min-height: 400px;
text-align: center;

mat-icon {
font-size: 48px;
width: 48px;
height: 48px;
margin-bottom: 16px;
}

p {
margin: 8px 0;
color: #666;
}

.loading-hint {
font-size: 0.875rem;
font-style: italic;
}

button {
margin-top: 16px;
}
}

.understanding-content {
.metadata-banner {
display: flex;
align-items: center;
gap: 8px;
padding: 12px 16px;
background-color: #e3f2fd;
border-left: 4px solid #2196f3;
margin-bottom: 24px;
border-radius: 4px;

mat-icon {
color: #2196f3;
}

.metadata-details {
margin-left: auto;
font-size: 0.875rem;
color: #666;
}
}

.markdown-content {
// Markdown styling
font-size: 1rem;
line-height: 1.6;

h1, h2, h3, h4, h5, h6 {
margin-top: 1.5em;
margin-bottom: 0.5em;
font-weight: 600;
}

h1 { font-size: 2rem; border-bottom: 2px solid #e0e0e0; padding-bottom: 0.3em; }
h2 { font-size: 1.5rem; border-bottom: 1px solid #e0e0e0; padding-bottom: 0.3em; }
h3 { font-size: 1.25rem; }
h4 { font-size: 1.1rem; }

p {
margin-bottom: 1em;
}

ul, ol {
margin-bottom: 1em;
padding-left: 2em;
}

li {
margin-bottom: 0.5em;
}

table {
width: 100%;
border-collapse: collapse;
margin-bottom: 1em;

th, td {
border: 1px solid #e0e0e0;
padding: 8px 12px;
text-align: left;
}

th {
background-color: #f5f5f5;
font-weight: 600;
}
}

blockquote {
border-left: 4px solid #2196f3;
padding-left: 16px;
margin-left: 0;
color: #666;
font-style: italic;
}

code {
background-color: #f5f5f5;
padding: 2px 6px;
border-radius: 3px;
font-family: 'Courier New', monospace;
font-size: 0.9em;
}

pre {
background-color: #f5f5f5;
padding: 12px;
border-radius: 4px;
overflow-x: auto;

code {
background-color: transparent;
padding: 0;
}
}

strong, b {
font-weight: 600;
color: #000;
}
}
}
}

CLI Tools​

Local Generation Script​

File: scripts/generate-page-explanation.sh

#!/bin/bash

################################################################################
# Generate Page Understanding (Local Development)
#
# Generates AI-powered page explanation for architectural plan pages using
# local project folders. Supports rapid iteration without cloud deployments.
#
# Usage:
# ./scripts/generate-page-explanation.sh --project-path=PATH [OPTIONS]
#
# Examples:
# # Generate for all pages in a project
# ./scripts/generate-page-explanation.sh \
# --project-path=projects/R2024.0091-2024-10-14
#
# # Generate for specific file and pages
# ./scripts/generate-page-explanation.sh \
# --project-path=projects/R2024.0091-2024-10-14 \
# --file-id=1 \
# --page-numbers=1,2,3
#
# # Force regeneration with verbose logging
# ./scripts/generate-page-explanation.sh \
# --project-path=projects/R2024.0091-2024-10-14 \
# --force \
# --verbose
#
# Prerequisites:
# - Java 17+ (Temurin 23 in dev container)
# - Maven 3.8+
# - Vertex AI credentials configured
# - Project structure: files/{file_id}/pages/{page_number}/
#
# What it does:
# 1. Validates project path and structure
# 2. Discovers pages to process
# 3. Calls PageExplanationService for each page
# 4. Generates page-explanation.md files
# 5. Updates metadata.json with generation status
# 6. Outputs summary (pages processed, tokens used, time)
################################################################################

set -e # Exit on any error

# Color codes for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Helper functions
log_info() { echo -e "${BLUE}ℹ️ $1${NC}"; }
log_success() { echo -e "${GREEN}βœ… $1${NC}"; }
log_warning() { echo -e "${YELLOW}⚠️ $1${NC}"; }
log_error() { echo -e "${RED}❌ $1${NC}"; }
log_section() {
echo ""
echo -e "${BLUE}================================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}================================================${NC}"
echo ""
}

# Default values
PROJECT_PATH=""
FILE_ID=""
PAGE_NUMBERS=""
FORCE=false
VERBOSE=false
MAX_ITERATIONS=3

# Parse arguments
while [[ $# -gt 0 ]]; do
case $1 in
--project-path=*)
PROJECT_PATH="${1#*=}"
shift
;;
--file-id=*)
FILE_ID="${1#*=}"
shift
;;
--page-numbers=*)
PAGE_NUMBERS="${1#*=}"
shift
;;
--force)
FORCE=true
shift
;;
--verbose)
VERBOSE=true
shift
;;
--max-iterations=*)
MAX_ITERATIONS="${1#*=}"
shift
;;
*)
log_error "Unknown argument: $1"
exit 1
;;
esac
done

# Validate required arguments
if [ -z "$PROJECT_PATH" ]; then
log_error "Missing required argument: --project-path"
echo "Usage: $0 --project-path=PATH [OPTIONS]"
exit 1
fi

# Validate project path exists
if [ ! -d "$PROJECT_PATH" ]; then
log_error "Project path does not exist: $PROJECT_PATH"
exit 1
fi

log_section "Page Understanding Generation"

log_info "Project: $PROJECT_PATH"
log_info "File ID: ${FILE_ID:-all files}"
log_info "Page Numbers: ${PAGE_NUMBERS:-all pages}"
log_info "Force Regenerate: $FORCE"
log_info "Verbose Logging: $VERBOSE"
log_info "Max Iterations: $MAX_ITERATIONS"

# Build Java command
log_section "Building Maven Project"

export JAVA_HOME=/usr/lib/jvm/temurin-23-jdk-arm64
mvn clean install -DskipTests

# Run generation (using Spring Boot CLI runner or direct service call)
log_section "Generating Page Understanding"

java -cp "target/classes:target/dependency/*" \
org.codetricks.construction.code.assistant.cli.GeneratePageExplanationCLI \
--project-path="$PROJECT_PATH" \
--file-id="$FILE_ID" \
--page-numbers="$PAGE_NUMBERS" \
--force="$FORCE" \
--verbose="$VERBOSE" \
--max-iterations="$MAX_ITERATIONS"

log_success "Page understanding generation complete!"

Project Upgrade and Generation Script​

File: scripts/upgrade-project-and-generate.sh

#!/bin/bash

################################################################################
# Upgrade Project to Multi-File Structure and Generate Understanding
#
# Combines project upgrade with page explanation generation for testing.
#
# Usage:
# ./scripts/upgrade-project-and-generate.sh \
# --source-project=SOURCE \
# --target-project=TARGET
#
# Example:
# ./scripts/upgrade-project-and-generate.sh \
# --source-project=projects/R2024.0091-2024-10-14 \
# --target-project=projects/R2024.0091-test-copy
################################################################################

set -e

# ... (Similar structure to above, omitted for brevity) ...

# 1. Copy project
log_section "Copying Project"
cp -r "$SOURCE_PROJECT" "$TARGET_PROJECT"

# 2. Upgrade to multi-file structure
log_section "Upgrading to Multi-File Structure"
./scripts/migrate-to-multi-file.sh --project-path="$TARGET_PROJECT"

# 3. Generate page explanation
log_section "Generating Page Understanding"
./scripts/generate-page-explanation.sh --project-path="$TARGET_PROJECT"

log_success "Project upgraded and understanding generated!"

Deployment Guide​

Step 1: Build and Test Locally​

# 1. Set Java environment
export JAVA_HOME=/usr/lib/jvm/temurin-23-jdk-arm64

# 2. Build project
mvn clean install

# 3. Run unit tests
mvn test -Dtest=PageExplanationServiceTest

# 4. Test local generation
./scripts/generate-page-explanation.sh \
--project-path=projects/R2024.0091-2024-10-14 \
--file-id=1 \
--page-numbers=3 \
--verbose

Step 2: Deploy Backend to Cloud Run​

# 1. Build Docker image
gcloud builds submit --tag gcr.io/PROJECT_ID/architectural-plan-service

# 2. Deploy to Cloud Run
gcloud run deploy architectural-plan-service \
--image gcr.io/PROJECT_ID/architectural-plan-service \
--platform managed \
--region us-central1 \
--allow-unauthenticated

# 3. Verify deployment
curl https://YOUR_CLOUD_RUN_URL/health

Step 3: Deploy Frontend to Cloud Storage​

# 1. Build Angular app
cd web-ng-m3
npm run build

# 2. Deploy to Cloud Storage
gsutil -m rsync -r -d dist/web-ng-m3 gs://YOUR_BUCKET/

# 3. Invalidate CDN cache (if using Cloud CDN)
gcloud compute url-maps invalidate-cdn-cache URL_MAP_NAME --path "/*"

Step 4: Test End-to-End​

  1. Open application in browser
  2. Navigate to a plan page
  3. Click "Details" tab
  4. Verify that an explanation is generated, or an existing one is displayed
  5. Check browser console for errors
  6. Verify markdown rendering

Performance Optimizations​

1. Prompt Caching​

// Cache PDF image across all turns to reduce costs by 90%
LLMRequest request = LLMRequest.builder()
    .imageBytes(pagePdfBytes)
    .enableCaching(true)
    .cacheImageMarker(true) // Mark for caching
    .build();

// Subsequent turns reuse cached image
// Cost: $0.30/1M tokens (cached) vs $3.00/1M (non-cached)
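The 90% figure in the comment above follows directly from the two rates. A minimal sketch of the arithmetic (`CacheSavings` is a hypothetical helper; the rates are the illustrative values quoted in this document):

```java
public class CacheSavings {
    // Input-token cost at the illustrative rates above:
    // $3.00 per 1M non-cached input tokens vs $0.30 per 1M cached tokens.
    public static double inputCostUsd(long freshTokens, long cachedTokens) {
        return freshTokens * 3.00 / 1_000_000
             + cachedTokens * 0.30 / 1_000_000;
    }
}
```

For the same token volume, serving tokens from the cache costs one tenth of the non-cached price, hence the roughly 90% reduction.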

2. Batch Processing​

// Process multiple pages with same PDF (file-level batching)
public void batchGenerateForFile(String projectId, String fileId, List<Integer> pageNumbers) {
// Load PDF once
byte[] filePdfBytes = loadFilePdf(projectId, fileId);

// Cache PDF at file level
String cacheKey = cachingManager.cachePdf(filePdfBytes);

// Process each page with cached PDF
for (int pageNumber : pageNumbers) {
processPage(projectId, fileId, pageNumber, cacheKey);
}
}

3. Asynchronous Processing​

// Use Cloud Run Jobs for background processing
@Async
public CompletableFuture<GeneratePageExplanationResponse> generateAsync(
        GeneratePageExplanationRequest request) {
    GeneratePageExplanationResponse response = generatePageExplanation(request);
    return CompletableFuture.completedFuture(response);
}
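A caller can fan out several of these async generations and wait for all of them. A minimal sketch using plain `CompletableFuture` (the `supplyAsync` lambda and the `"page-" + p` result stand in for the real service call and response; `AsyncBatchSketch` is a hypothetical name):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class AsyncBatchSketch {
    // Kick off one async generation per page, then wait for all results.
    public static List<String> generateAll(List<Integer> pages) {
        List<CompletableFuture<String>> futures = pages.stream()
                .map(p -> CompletableFuture.supplyAsync(() -> "page-" + p))
                .toList();
        // join() walks the futures in request order, so results come back
        // in the same order regardless of completion order
        return futures.stream().map(CompletableFuture::join).toList();
    }
}
```

With Spring's `@Async` as shown above, the framework supplies the executor; this sketch uses the common pool only to stay self-contained.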

Security Implementation​

1. Access Control​

// Verify user has access to project before generating
@PreAuthorize("hasProjectAccess(#request.projectId)")
public GeneratePageExplanationResponse generatePageExplanation(
GeneratePageExplanationRequest request) {
// ...
}

2. Rate Limiting​

// Limit generation requests per user to prevent abuse
@RateLimited(maxRequests = 10, windowSeconds = 3600)
public GeneratePageExplanationResponse generatePageExplanation(
GeneratePageExplanationRequest request) {
// ...
}
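`@RateLimited` is a project-specific annotation; its interceptor could delegate to a simple fixed-window counter such as the sketch below (an illustration of the windowing logic only, not the actual implementation):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FixedWindowRateLimiter {
    private final int maxRequests;
    private final long windowMillis;
    // Per-user state: [window start millis, request count in window]
    private final Map<String, long[]> windows = new ConcurrentHashMap<>();

    public FixedWindowRateLimiter(int maxRequests, long windowSeconds) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowSeconds * 1000;
    }

    public synchronized boolean tryAcquire(String userId, long nowMillis) {
        long[] state = windows.computeIfAbsent(userId, k -> new long[]{nowMillis, 0});
        if (nowMillis - state[0] >= windowMillis) {
            // Window expired: start a fresh one
            state[0] = nowMillis;
            state[1] = 0;
        }
        if (state[1] >= maxRequests) {
            return false; // over quota for this window
        }
        state[1]++;
        return true;
    }
}
```

With `maxRequests = 10, windowSeconds = 3600` this matches the annotation above: at most 10 generations per user per hour, resetting at the window boundary.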

Observability and Trajectory Tracking​

Agent Trajectory Capture​

Implementation: Capture complete agent execution trace using AgentTrajectory proto.

File: src/main/proto/features/agent_trajectory.proto

Integration with Existing Infrastructure:

  • Reuses existing LlmTrace for individual LLM calls
  • Extends with iteration and tool tracking
  • Stores in BigQuery + GCS for dual access patterns

Trajectory Builder​

public class AgentTrajectoryBuilder {
    private final String trajectoryId;
    private final List<AgentIteration> iterations = new ArrayList<>();
    private final Instant startedAt;
    // Mutable state for the iteration currently being recorded
    private AgentIteration currentIteration;
    private final List<AgentTurn> currentIterationTurns = new ArrayList<>();

    public void startIteration(int iterationNumber) {
        currentIteration = AgentIteration.newBuilder()
            .setIterationNumber(iterationNumber)
            .setStartedAt(Timestamps.fromMillis(System.currentTimeMillis()))
            .build();
        currentIterationTurns.clear();
    }

    public void recordLlmTurn(int turnNumber, String phaseName, LlmTrace llmTrace) {
        AgentTurn turn = AgentTurn.newBuilder()
            .setTurnNumber(turnNumber)
            .setIterationNumber(currentIteration.getIterationNumber())
            .setTurnType(TurnType.LLM_CALL)
            .setLlmTurn(LlmTurn.newBuilder()
                .setModelName(llmTrace.getModelName())
                .setPhaseName(phaseName)
                .setLlmTrace(llmTrace)
                .setInputTokens(llmTrace.getUsageMetadata().getPromptTokenCount())
                .setOutputTokens(llmTrace.getUsageMetadata().getCandidatesTokenCount())
                .setCachedTokens(llmTrace.getUsageMetadata().getCachedContentTokenCount())
                .build())
            .build();

        currentIterationTurns.add(turn);
    }

    public AgentTrajectory build() {
        return AgentTrajectory.newBuilder()
            .setTrajectoryId(trajectoryId)
            .addAllIterations(iterations)
            .setTotalTurns(getTotalTurnCount())
            .setCostAnalysis(aggregateCosts())
            .build();
    }
}

ADK Callbacks for Trajectory Capture​

public class TrajectoryTrackingCallback implements AfterModelCallbackSync {
    private final AgentTrajectoryBuilder trajectoryBuilder;
    private int turnCounter = 0;

    @Override
    public Maybe<Content> call(CallbackContext ctx) {
        turnCounter++;

        // Extract LLM trace from context
        LlmTrace llmTrace = buildLlmTraceFromContext(ctx);

        // Determine phase from agent state
        String phase = determinePhaseName(ctx); // "GENERATE", "REFLECT", "REFINE"

        // Record in trajectory
        trajectoryBuilder.recordLlmTurn(turnCounter, phase, llmTrace);

        // Also log to BigQuery (existing infrastructure)
        llmLogTracer.logTrace(llmTrace);

        return Maybe.empty();
    }

    private String determinePhaseName(CallbackContext ctx) {
        // Analyze the last tool call to determine the phase;
        // look for keywords: "generate", "reflect", "refine"
        String lastToolCalled = ctx.invocationContext().getLastToolName();
        if (lastToolCalled == null) return "UNKNOWN"; // no tool called yet
        if (lastToolCalled.contains("generate")) return "GENERATE";
        if (lastToolCalled.contains("reflect")) return "REFLECT";
        if (lastToolCalled.contains("refine")) return "REFINE";
        return "UNKNOWN";
    }
}

Trajectory Storage​

GCS Path: projects/{projectId}/traces/page-explanation/{trajectory_id}.json
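That object path can be assembled with a small helper (a sketch; `TracePaths` is a hypothetical class name):

```java
public class TracePaths {
    // Formats the GCS object path documented above:
    // projects/{projectId}/traces/page-explanation/{trajectory_id}.json
    public static String pageExplanationTracePath(String projectId, String trajectoryId) {
        return String.format("projects/%s/traces/page-explanation/%s.json",
                projectId, trajectoryId);
    }
}
```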

JSON Export:

public String exportTrajectoryAsJson(AgentTrajectory trajectory, boolean prettyPrint)
        throws InvalidProtocolBufferException {
    // JsonFormat.printer() emits indented JSON by default; strip the
    // whitespace when a compact payload is requested instead.
    JsonFormat.Printer printer = JsonFormat.printer().includingDefaultValueFields();
    if (!prettyPrint) {
        printer = printer.omittingInsignificantWhitespace();
    }
    return printer.print(trajectory);
}

Firestore Index (for searching):

Collection: agent_trajectories
Document ID: {trajectory_id}
Fields:
- workflow_name: "page_explanation"
- project_id: "R2024.0091"
- file_id: "1"
- page_number: 3
- started_at: timestamp
- total_duration_ms: 105000
- total_turns: 3
- iterations_completed: 1
- final_quality_score: 0.85
- total_cost_usd: 0.038
- gcs_path: "projects/.../traces/..."
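Before writing the index document, the fields above can be assembled into a map. A sketch (`TrajectoryIndexDoc` is a hypothetical helper; the Firestore write itself is omitted):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TrajectoryIndexDoc {
    // Builds the field map for a document in the agent_trajectories
    // collection, mirroring the index schema listed above.
    public static Map<String, Object> fields(String projectId, String fileId,
            int pageNumber, long startedAtMillis, long durationMs, int totalTurns,
            int iterationsCompleted, double qualityScore, double costUsd, String gcsPath) {
        Map<String, Object> f = new LinkedHashMap<>();
        f.put("workflow_name", "page_explanation");
        f.put("project_id", projectId);
        f.put("file_id", fileId);
        f.put("page_number", pageNumber);
        f.put("started_at", startedAtMillis);
        f.put("total_duration_ms", durationMs);
        f.put("total_turns", totalTurns);
        f.put("iterations_completed", iterationsCompleted);
        f.put("final_quality_score", qualityScore);
        f.put("total_cost_usd", costUsd);
        f.put("gcs_path", gcsPath);
        return f;
    }
}
```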

CLI Tool: Export Trajectory​

#!/bin/bash
# cli/codeproof.sh export-trajectory

TRAJECTORY_ID="$1"
OUTPUT_FILE="${2:-trajectory.json}"

# Call gRPC API (quote expansions so an unusual ID can't break the JSON)
grpcurl -plaintext -d '{
  "trajectory_id": "'"$TRAJECTORY_ID"'",
  "include_full_llm_traces": true,
  "pretty_print": true
}' \
  localhost:8080 \
  PageExplanationService/ExportAgentTrajectory \
  | jq -r '.trajectory_json' > "$OUTPUT_FILE"

echo "Trajectory exported to: $OUTPUT_FILE"

Future: Trajectory Visualization UI​

Component: TrajectoryViewer (Angular)

Features:

  • Timeline view of all turns
  • Expandable sections for each iteration
  • Diff view between draft versions
  • Cost breakdown visualization
  • Quality score progression chart
  • Search/filter by phase, model, cost

Mockup:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Agent Trajectory: page_explanation β”‚
β”‚ Project: R2024.0091 | Page: 3 | File: 1 β”‚
β”‚ Duration: 1m 45s | Cost: $0.038 | Quality: 0.85β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ πŸ“Š Quality Score: [β–“β–“β–“β–“β–“β–“β–“β–“β–“β–‘] 85% β”‚
β”‚ πŸ’° Cost Breakdown: β”‚
β”‚ β”œβ”€ Gemini Pro (2 turns): $0.038 β”‚
β”‚ └─ Gemini Flash (1 turn): $0.0004 β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ Iteration 1 β–Ό β”‚
β”‚ β”œβ”€ Turn 1 (GENERATE) - Gemini Pro β”‚
β”‚ β”‚ Input: 7500 tokens | Output: 2000 β”‚
β”‚ β”‚ Cost: $0.019 | Time: 35s β”‚
β”‚ β”‚ πŸ“„ View Prompt | View Response β”‚
β”‚ β”œβ”€ Turn 2 (REFLECT) - Gemini Flash β”‚
β”‚ β”‚ Quality: 0.75 | Gaps: 3 identified β”‚
β”‚ β”‚ Cost: $0.0004 | Time: 5s β”‚
β”‚ β”‚ πŸ“Š View Reflection JSON β”‚
β”‚ └─ Turn 3 (REFINE) - Gemini Pro β”‚
β”‚ Cached: 5000 tokens | Cost: $0.019 β”‚
β”‚ Quality: 0.85 βœ“ Threshold met β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Troubleshooting​

Issue: High Token Costs​

Symptoms: Token usage exceeds budget, costs higher than expected

Solutions:

  1. Verify prompt caching is enabled and working
  2. Check cache hit rate in logs
  3. Use cheaper models for reflection turns (e.g., Gemini Flash instead of Gemini Pro)
  4. Reduce max_phases_completed to 2 instead of 3

Issue: Slow Generation​

Symptoms: Takes >5 minutes per page

Solutions:

  1. Check LLM API latency
  2. Reduce image resolution for PDF
  3. Use async processing (don't block user)
  4. Optimize prompts to reduce output tokens

Issue: Poor Quality Output​

Symptoms: Markdown is not educational or contains errors

Solutions:

  1. Review and improve prompt templates
  2. Increase max_phases_completed to 4 or 5
  3. Add example outputs to prompts
  4. Use higher temperature for creativity
  5. Collect human feedback for prompt tuning

References​