
TDD: Architectural Plan Page Explanation & Educational Details

πŸ“‹ Product Requirements: Plan Page Explanation PRD
πŸ“‹ Implementation Issue: Issue #258 - AI-Powered Plan Page Explanation with Agentic Workflow

Overview​

This Technical Design Document details the implementation of AI-powered page explanation generation that transforms raw LLM-extracted text and PDF images into comprehensive, professional explanation markdown. The system uses an agentic multi-turn workflow (ReAct pattern) to iteratively refine explanations through self-reflection.

Architecture Overview​

System Components​

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Frontend (Angular) β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ PageViewerComponent β”‚ β”‚
β”‚ β”‚ β”œβ”€ [Overview] [Preview] [Compliance] [Details] ← NEW β”‚ β”‚
β”‚ β”‚ └─ DetailsTabComponent ← NEW β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚ gRPC-Web β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ β–Ό Backend (Java/Spring) β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ ArchitecturalPlanService (Facade) β”‚ β”‚
β”‚ β”‚ └─ getArchitecturalPlanPage(...) ← Enhanced β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚ β”‚ β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ PageExplanation β”‚ β”‚ Agentic Workflow β”‚ β”‚
β”‚ β”‚ Service (NEW) β”‚ β”‚ Engine (NEW) β”‚ β”‚
β”‚ β”‚ β”œβ”€ generate() β”‚ β”‚ β”œβ”€ AgenticPage β”‚ β”‚
β”‚ β”‚ β”œβ”€ get() β”‚ β”‚ β”‚ Interpreter β”‚ β”‚
β”‚ β”‚ └─ regenerate() β”‚ β”‚ └─ IterativeRefiner β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚ β”‚ β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ LLM Integration Layer β”‚ β”‚
β”‚ β”‚ (Vertex AI - Gemini Models) β”‚ β”‚
β”‚ β”‚ + Prompt Caching Manager β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Cloud Storage (GCS) β”‚
β”‚ projects/{projectId}/files/{file_id}/pages/{pageNumber}/ β”‚
β”‚ β”œβ”€β”€ page.pdf ← INPUT (cached) β”‚
β”‚ β”œβ”€β”€ page.md ← INPUT β”‚
β”‚ β”œβ”€β”€ page-explanation.md ← OUTPUT (NEW) β”‚
β”‚ └── metadata.json ← UPDATED β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
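The storage layout above can be captured in a small path helper. This is an illustrative sketch only; the production code resolves these paths through ProjectPathResolver, and the class and method names here are hypothetical:

```java
// Illustrative sketch of the GCS layout shown above; the real code
// resolves these paths via ProjectPathResolver.
public class PagePaths {

    // Root folder for one plan page, following the layout:
    // projects/{projectId}/files/{file_id}/pages/{pageNumber}/
    public static String pageFolder(String projectId, String fileId, int pageNumber) {
        return String.format("projects/%s/files/%s/pages/%d", projectId, fileId, pageNumber);
    }

    // Location of the new explanation artifact written by the workflow.
    public static String explanationPath(String projectId, String fileId, int pageNumber) {
        return pageFolder(projectId, fileId, pageNumber) + "/page-explanation.md";
    }
}
```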

Data Flow: Page Explanation Generation

User/System Request
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ PageExplanationService β”‚
β”‚ .generatePageExplanation(...) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Load Page Context β”‚
β”‚ β”œβ”€ Read page.md β”‚
β”‚ β”œβ”€ Read page.pdf β”‚
β”‚ └─ Read metadata.json β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ AgenticPageInterpreter β”‚
β”‚ .explainPage(pageContext) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ β”‚
β–Ό β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ TURN 1: Generate β”‚ β”‚ Prompt Caching β”‚
β”‚ Initial Draft │◄────────────── Manager β”‚
β”‚ β”‚ β”‚ (Cache PDF) β”‚
β”‚ Input: β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚ - page.pdf (img) β”‚
β”‚ - page.md (text) β”‚
β”‚ - generation promptβ”‚
β”‚ β”‚
β”‚ Output: β”‚
β”‚ - draft-v1.md β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ TURN 2: Reflect β”‚
β”‚ on Draft Quality β”‚
β”‚ β”‚
β”‚ Input: β”‚
β”‚ - draft-v1.md β”‚
β”‚ - reflection promptβ”‚
β”‚ β”‚
β”‚ Output: β”‚
β”‚ - reflection.json β”‚
β”‚ (gaps, issues) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ TURN 3: Refine β”‚
β”‚ with Reflection β”‚
β”‚ β”‚
β”‚ Input: β”‚
β”‚ - page.pdf (cached)β”‚
β”‚ - draft-v1.md β”‚
β”‚ - reflection.json β”‚
β”‚ - refinement promptβ”‚
β”‚ β”‚
β”‚ Output: β”‚
β”‚ - page-explanation β”‚
β”‚ .md (FINAL) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Optional: Additionalβ”‚
β”‚ Iterations β”‚
β”‚ (If max_iter > 3) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Save Artifacts β”‚
β”‚ β”œβ”€ Write page-explanation.md β”‚
β”‚ β”œβ”€ Update metadata.json β”‚
β”‚ └─ Log generation metrics β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
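The turn sequence above reduces to a plain control loop. The sketch below stubs the three turns with placeholder string functions so the shape of the Generate β†’ Reflect β†’ Refine flow is visible; the real implementation replaces each stub with an LLM call:

```java
// Control-flow sketch of the Generate -> Reflect -> Refine workflow above.
// The three turn methods are stand-ins for real model calls.
public class ExplanationLoop {

    static String generate(String pageText) { return "draft:" + pageText; }

    static String reflect(String draft) { return "gaps(" + draft + ")"; }

    static String refine(String draft, String reflection) {
        return draft + "+fixed[" + reflection + "]";
    }

    // Runs one Generate turn, then Reflect/Refine pairs up to maxIterations.
    public static String run(String pageText, int maxIterations) {
        String draft = generate(pageText);          // Turn 1
        for (int i = 0; i < maxIterations; i++) {
            String reflection = reflect(draft);     // Turn 2, 4, ...
            draft = refine(draft, reflection);      // Turn 3, 5, ...
        }
        return draft;                               // page-explanation.md content
    }
}
```

With maxIterations = 1 this yields exactly the three-turn sequence diagrammed above; extra iterations append more Reflect/Refine pairs.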

Proto Message Definitions​

Import Existing Cost Analysis​

import "cost_analysis.proto";
import "google/protobuf/timestamp.proto";

New Messages for Page Interpretation​

syntax = "proto3";

package codetricks.construction.api;

import "google/protobuf/timestamp.proto";
import "cost_analysis.proto";

// ============================================================================
// PAGE INTERPRETATION - Request/Response Messages
// ============================================================================

// Request to generate AI-powered professional explanation for a plan page
message GeneratePageExplanationRequest {
  // Project identification
  string project_id = 1;
  string file_id = 2;
  int32 page_number = 3;

  // Processing options
  bool force_regenerate = 4;  // Regenerate even if already exists
  int32 max_iterations = 5;   // Maximum agentic iterations (default: 3)
  bool verbose_logging = 6;   // Log prompts and responses

  // Model configuration (optional, uses defaults if not specified)
  // Multi-model strategy enables cost optimization:
  // - Use premium models (Gemini Pro) for quality-critical tasks (Generation, Refinement)
  // - Use efficient models (Gemini Flash) for analytical tasks (Reflection, Scoring)
  // - Gemini Flash is substantially cheaper than Pro with minimal quality impact for reflection
  string primary_model = 7;    // Primary model for generation/refinement (default: "gemini-2.5-pro-latest")
  string reflection_model = 8; // Model for reflection/analysis (default: same as primary, or "gemini-2.0-flash-exp" for large cost savings)
  bool enable_caching = 9;     // Use prompt caching (default: true)

  // Advanced: Per-turn model override for experimentation
  map<string, string> turn_models = 10; // Turn type β†’ model (e.g., {"REFLECT": "gemini-flash", "GENERATE": "gemini-pro"})
}

message GeneratePageExplanationResponse {
  bool success = 1;
  string status_message = 2;

  // Metadata about generated explanation
  PageExplanationMetadata metadata = 3;

  // Performance metrics (reuses existing CostAnalysisMetadata)
  CostAnalysisMetadata cost_analysis = 4;
  int32 processing_time_seconds = 5;

  // Error details (if success = false)
  string error_code = 6;
  string error_details = 7;
}

// Metadata about page explanation generation
message PageExplanationMetadata {
  // Generation status
  string status = 1; // "pending", "processing", "completed", "failed"
  google.protobuf.Timestamp generated_at = 2;

  // Model tracking (multi-model support for cost optimization)
  // Phase 1: Single model (primary_model only)
  // Phase 2: Multi-model (different models for different turn types)
  string primary_model = 3;           // Main model used (e.g., "gemini-2.5-pro-latest")
  map<string, int32> models_used = 4; // Model β†’ turn count (e.g., {"gemini-2.5-pro": 2, "gemini-flash": 1})

  // Workflow tracking
  int32 iterations_completed = 5; // Number of complete loop cycles
  int32 total_turns = 6;          // Total LLM API calls

  // Cost analysis (reuses existing CostAnalysisMetadata)
  // Provides comprehensive token tracking:
  // - Total tokens and estimated cost
  // - Detailed breakdown (non-cached input, cached content, output)
  // - Rate per million tokens
  // - Discount percentages for cached content
  // - Processing metadata (duration, caching efficiency)
  CostAnalysisMetadata cost_analysis = 7;

  // Output file
  string file_path = 8; // Relative path: "page-explanation.md"

  // Quality metrics (optional, for evaluation)
  float quality_score = 9; // 0.0-1.0 (future: human evaluation)
}

Note: This reuses the existing CostAnalysisMetadata message from cost_analysis.proto (Issue #176) which is already used for task cost tracking. This ensures:

  • Consistency: Same cost tracking across all LLM operations
  • Richness: Comprehensive token breakdown with caching metrics
  • Integration: Works with existing Firestore task tracking
  • No Duplication: Avoids creating redundant proto messages

Multi-Model Support: The MetaCostAnalysis message (also in cost_analysis.proto) supports per-model cost breakdown for workflows that use multiple models:

message MetaCostAnalysis {
  map<string, CostAnalysisMetadata> model_costs = 1; // Per-model breakdown
  double total_cost_usd = 2; // Aggregated total
  int32 total_tokens = 3;
  string primary_model = 4;
}

This is perfect for multi-model workflows where:

  • Turn 1 (Generate): Uses a premium model (e.g., Gemini 2.5 Pro) β†’ tracked separately
  • Turn 2 (Reflect): Uses an efficient model (e.g., Gemini Flash) β†’ tracked separately
  • Turn 3 (Refine): Uses a premium model β†’ tracked separately
  • Final: Aggregated cost shows the total across all models

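The per-model aggregation can be mirrored in plain Java. This is a sketch that tracks only cost totals (field names echo the proto, but this is not the generated proto class, and MetaCost/addTurn are illustrative names):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch of the MetaCostAnalysis aggregation described above;
// the real workflow populates the generated proto messages instead.
public class MetaCost {

    // model name -> accumulated USD cost across all turns run on that model
    private final Map<String, Double> modelCosts = new LinkedHashMap<>();

    public void addTurn(String model, double costUsd) {
        modelCosts.merge(model, costUsd, Double::sum);
    }

    // Aggregated total across all models (total_cost_usd in the proto).
    public double totalCostUsd() {
        return modelCosts.values().stream().mapToDouble(Double::doubleValue).sum();
    }
}
```

Feeding in the three turns from the multi-model example below (two Pro turns at $0.019 and one Flash turn at $0.0004) aggregates to roughly $0.038 per page.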
// ============================================================================
// EXISTING MESSAGE UPDATES
// ============================================================================

// Update to existing ArchitecturalPlanPage message
message ArchitecturalPlanPage {
  // ... existing fields (pageNumber, fileId, title, summary, etc.) ...

  // NEW: Rich explanation content
  string explanation_markdown = 20; // Content from page-explanation.md
  PageExplanationMetadata explanation_metadata = 21;

  // Indicates if an explanation is available
  bool has_explanation = 22;
}

// Update to existing GetArchitecturalPlanPageRequest
message GetArchitecturalPlanPageRequest {
  string project_id = 1;
  string file_id = 2;
  int32 page_number = 3;

  // NEW: Control whether to include the explanation
  bool include_explanation = 4; // Default: true
}

New gRPC Service Methods​

service ArchitecturalPlanService {
  // ... existing methods ...

  // NEW: Generate page explanation
  rpc GeneratePageExplanation(GeneratePageExplanationRequest)
      returns (GeneratePageExplanationResponse);

  // NEW: Get page explanation status
  rpc GetPageExplanationStatus(GetPageExplanationStatusRequest)
      returns (PageExplanationMetadata);

  // NEW: Batch generate for multiple pages
  rpc BatchGeneratePageExplanation(BatchGeneratePageExplanationRequest)
      returns (stream GeneratePageExplanationResponse);
}

message GetPageExplanationStatusRequest {
  string project_id = 1;
  string file_id = 2;
  int32 page_number = 3;
}

message BatchGeneratePageExplanationRequest {
  string project_id = 1;
  string file_id = 2;
  repeated int32 page_numbers = 3; // Pages to process

  GeneratePageExplanationRequest options = 4; // Shared options
}

Multi-Model Strategy (Detailed)​

Model Selection by Task Type​

The agentic workflow performs different types of tasks, each with different requirements and optimal model choices:

| Task Type | Turn(s) | Requirements | Best Models | Cost/Turn | Quality Impact |
|---|---|---|---|---|---|
| Generation | 1 | Creative writing, comprehensive coverage, professional tone | Gemini 2.5 Pro, other premium models, GPT-4 | ~$0.02 | Critical - use premium |
| Reflection | 2, 4, 6... | Analytical assessment, gap identification, scoring | Gemini Flash, other efficient models | ~$0.0004 | Minimal - use efficient |
| Refinement | 3, 5, 7... | Creative improvement, tone consistency, gap filling | Gemini 2.5 Pro (same as Turn 1) | ~$0.02 | Critical - use premium |
| Orchestration | N/A | Quality threshold checks, iteration decisions | Gemini Flash | ~$0.0001 | None - use fastest |
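The task-type-to-model mapping above can be encoded directly. A sketch (the TurnType enum and the default model IDs are illustrative, not a fixed contract):

```java
// Sketch of the task-type -> model mapping from the table above.
// Model IDs are illustrative defaults.
public class ModelSelector {

    public enum TurnType { GENERATE, REFLECT, REFINE, ORCHESTRATE }

    public static String modelFor(TurnType turn) {
        switch (turn) {
            case GENERATE:
            case REFINE:
                return "gemini-2.5-pro";   // quality-critical creative turns
            case REFLECT:
            case ORCHESTRATE:
            default:
                return "gemini-2.5-flash"; // cheap analytical turns
        }
    }
}
```

Centralizing the choice in one helper keeps the per-turn model policy auditable and makes the turn_models override in the request proto easy to layer on top.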

Cost Optimization Examples​

Example 1: Single-Model (Phase 1 Implementation)​

PageExplanationConfig config = PageExplanationConfig.builder()
    .primaryModel("gemini-2.5-pro")
    .reflectionModel(null) // Use primary for all turns
    .maxIterations(1)
    .build();

// Iteration 1:
// - Turn 1 (Generate): Gemini Pro β†’ $0.019
// - Turn 2 (Reflect): Gemini Pro β†’ $0.003
// - Turn 3 (Refine): Gemini Pro β†’ $0.019
// Total: ~$0.04/page

Example 2: Multi-Model (Phase 2 Optimization)​

PageExplanationConfig config = PageExplanationConfig.builder()
    .primaryModel("gemini-2.5-pro")      // For Generate/Refine
    .reflectionModel("gemini-2.5-flash") // For Reflect (far cheaper)
    .maxIterations(1)
    .build();

// Iteration 1:
// - Turn 1 (Generate): Gemini Pro β†’ $0.019
// - Turn 2 (Reflect): Gemini Flash β†’ $0.0004 (~87% cheaper than the $0.003 Pro reflection)
// - Turn 3 (Refine): Gemini Pro β†’ $0.019
// Total: ~$0.038/page (savings grow with every additional reflection turn)

Example 3: Dynamic Model Selection (Advanced)​

// Use cheap model for simple pages, premium for complex ones
PageExplanationConfig config = PageExplanationConfig.builder()
.primaryModel(pageComplexity > 0.7
? "gemini-2.5-pro" // Complex: Use premium
: "gemini-2.5-flash") // Simple: Use efficient
.reflectionModel("gemini-2.5-flash") // Always cheap for reflection
.maxIterations(pageComplexity > 0.7 ? 2 : 1) // More iterations for complex pages
.build();

Model Capabilities Matrix​

| Model | Input Cost | Output Cost | Cached Cost | Strengths | Best For |
|---|---|---|---|---|---|
| Gemini 2.5 Pro ⭐ | $1.25/1M | $5.00/1M | $0.315/1M | Excellent quality, best cost/perf ratio | Generation, Refinement |
| Gemini Flash ⭐ | $0.075/1M | $0.30/1M | $0.01875/1M | Extremely fast and cheap, good analysis | Reflection, Simple pages |
| Other premium models | $3.00/1M | $15.00/1M | $0.30/1M | Superior creative writing | Complex creative tasks |
| Other efficient models | $0.25/1M | $1.25/1M | $0.03/1M | Fast, cost-effective | Reflection, Scoring |
| GPT-4 Turbo | $10.00/1M | $30.00/1M | N/A | Superior reasoning | Very complex pages only |

⭐ Recommended: Gemini 2.5 Pro + Flash combination offers the best balance of quality and cost.
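The per-turn cost figures in the tables above follow directly from the listed per-1M-token rates. A sketch of the arithmetic (the token counts below are illustrative for a typical plan page, not measured values):

```java
// Cost arithmetic behind the per-turn estimates above.
// Rates are USD per 1M tokens from the capabilities matrix.
public class TurnCost {

    public static double cost(double inRatePerM, double outRatePerM,
                              long inputTokens, long outputTokens) {
        return inRatePerM * inputTokens / 1_000_000.0
             + outRatePerM * outputTokens / 1_000_000.0;
    }
}
```

With roughly 8K input and 2K output tokens, a Gemini 2.5 Pro turn lands near the ~$0.02 figure used above, while a short Flash reflection turn comes in around $0.0004-$0.0005.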

Turn-Specific Model Recommendations​

Turn 1 (Initial Generation):

  • Recommended: Gemini 2.5 Pro ⭐
  • Why: First impression sets tone and structure, quality-critical
  • Advantages: Excellent quality, multimodal, great cost/perf ratio ($1.25/1M input)
  • Avoid: Gemini Flash for initial generation - quality matters more than speed here

Turn 2, 4, 6... (Reflection):

  • Recommended: Gemini Flash ⭐
  • Why: Analytical task, structured output (JSON), minimal creativity needed
  • Advantages: Extremely cheap ($0.075/1M vs $1.25/1M for Pro, ~94% savings on input)
  • Quality Impact: Minimal - reflection is analysis, not creative writing

Turn 3, 5, 7... (Refinement):

  • Recommended: Gemini 2.5 Pro ⭐ (same as Turn 1)
  • Why: Must maintain consistent tone, style, and quality
  • Avoid: Switching between Pro and Flash for generation tasks - causes style inconsistency

ADK (Agent Development Kit) Integration​

Phase 1 MVP: Pragmatic 3-Tool Implementation​

Goal: Ship working feature fast, design for future expansion

Philosophy: Start simple, build sophisticated

  • 3 core tools (not 11)
  • 1 agent (not 8)
  • Sequential execution (parallel later)
  • Full observability from day 1
  • Extensible architecture (add tools incrementally)

Timeline: 1 week to working feature

Overview​

This feature will be implemented using Google's Agent Development Kit (ADK) for Java to leverage proven agent orchestration patterns already established in the codebase.

Maven Dependency:

<dependency>
  <groupId>com.google.adk</groupId>
  <artifactId>google-adk</artifactId>
  <version>0.3.0</version>
</dependency>

Related:

  • ADK Java Source Reference: github/adk-java/ (downloaded for source code reference only, not used directly)
  • Maven Central: google-adk
  • Documentation: ADK Java Docs
  • Issue #257: Custom OpenAPI Toolset and ADK integration
  • Existing Usage: ArchitecturalPlanReviewAgent.java, MultiToolAgent.java

Note: We use the Maven dependency for the actual implementation. The github/adk-java/ folder is downloaded for source code reference and documentation purposes only.

Why ADK for Page Explanation?​

  1. Proven Framework: Already used successfully in ArchitecturalPlanReviewAgent for building code compliance
  2. Multi-Turn Support: Native support for iterative workflows (Generate→Reflect→Refine)
  3. Tool Integration: FunctionTool for custom methods, easy reflection/refinement orchestration
  4. State Management: Built-in session and memory management for multi-turn conversations
  5. Gemini Native: Designed for Gemini models with optimal integration
  6. Event Streaming: RxJava Flowable<Event> for reactive progress tracking
  7. Callbacks: Before/after hooks for logging, cost tracking, quality scoring

Phase 1 MVP: 3-Tool Architecture​

PageExplanationAgent (LlmAgent)
β”œβ”€ Model: gemini-2.5-pro-latest (primary)
β”œβ”€ Temperature: 0.0 (maximum consistency)
β”œβ”€ Instruction: "Orchestrate Generate β†’ Assess β†’ Refine workflow"
β”‚
β”œβ”€ Tools (Phase 1 - Core 3):
β”‚ β”œβ”€ generateExplanation() - Uses Pro, creates initial draft
β”‚ β”œβ”€ assessQuality() - Uses Flash, returns `{score, confidence, gaps}`
β”‚ └─ refineExplanation() - Uses Pro, improves draft
β”‚
β”œβ”€ Tools (Phase 1.5 - Easy Additions):
β”‚ β”œβ”€ extractKeyInsights() - Flash, understand page first
β”‚ └─ checkCompleteness() - Flash, validate coverage
β”‚ (Just add FunctionTool, no architecture change)
β”‚
β”œβ”€ Tools (Phase 2+ - Future):
β”‚ β”œβ”€ searchBuildingCodes() - Flash + RAG
β”‚ β”œβ”€ analyzeVisualElements() - Pro, multimodal
β”‚ β”œβ”€ validateTechnicalTerms() - Flash
β”‚ └─ findRelatedPages() - Flash
β”‚ (Add as needed, architecture supports it)
β”‚
β”œβ”€ Callbacks (Observability):
β”‚ β”œβ”€ afterModelCallback - TrajectoryTrackingCallback
β”‚ └─ afterToolCallback - CostTrackingCallback
β”‚
└─ Session Management: InMemorySessionService

Extensibility Pattern: Tools array is the only change needed to add features!

ADK Implementation Pattern​

Following the established pattern from ArchitecturalPlanReviewAgent:

// Similar to ArchitecturalPlanReviewAgent.java
public class PageExplanationAgent {

    // Public static for ADK Dev UI compatibility
    public static BaseAgent ROOT_AGENT = initAgent();

    public static BaseAgent initAgent() {
        return LlmAgent.builder()
            .name("page_explanation_agent")
            .model("gemini-2.5-pro-latest") // Primary model
            .generateContentConfig(
                GenerateContentConfig.builder()
                    .temperature(0.0F) // Maximum predictability and consistency
                    .build())
            .description("Generates professional explanations of architectural plan pages")
            .instruction(AGENT_INSTRUCTION)
            .tools(
                FunctionTool.create(PageExplanationAgent.class, "generateInitialDraft"),
                FunctionTool.create(PageExplanationAgent.class, "reflectOnQuality"),
                FunctionTool.create(PageExplanationAgent.class, "refineExplanation"))
            .afterModelCallback(new CostTrackingCallback())
            .build();
    }
}

Multi-Model Support with ADK​

ADK doesn't natively support per-turn model switching, but we can implement it using custom tools with embedded model calls:

public class PageExplanationTools {

    private final VertexAiClient vertexAi;

    /**
     * Tool for reflection using the efficient Gemini Flash model.
     * This bypasses the agent's primary model to use a cheaper model.
     */
    public ReflectionResult reflectOnQuality(
        @Schema(description = "The draft explanation to review") String draftMarkdown) {

        // Use Gemini Flash for cost savings (not the agent's primary Gemini Pro model)
        GenerativeModel flashModel = new GenerativeModel.Builder()
            .setModelName("gemini-2.0-flash-exp")
            .setVertexAi(vertexAi)
            .build();

        String reflectionPrompt = buildReflectionPrompt(draftMarkdown);
        GenerateContentResponse response = flashModel.generateContent(reflectionPrompt);

        // Parse reflection JSON response
        return ReflectionResult.fromJson(response.getText());
    }
}

Iterative Workflow with ADK​

The ADK agent naturally supports our Generate→Reflect→Refine workflow:

Iteration 1:

  1. Turn 1: Agent calls generateInitialDraft() tool
  2. Turn 2: Agent calls reflectOnQuality() tool (uses Flash model internally)
  3. Turn 3: Agent calls refineExplanation() tool with reflection results

Iteration 2+ (if quality < threshold):

  4. Turn 4: Agent calls reflectOnQuality() again
  5. Turn 5: Agent calls refineExplanation() again

The agent decides when to stop based on:

  • Quality score from reflection
  • Max iterations reached
  • Instruction-based stopping criteria
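The stopping decision reduces to a small predicate. A sketch (the threshold value and class/method names are illustrative):

```java
// Sketch of the agent's stopping criteria listed above: stop once the
// reflection quality score clears the threshold, or iterations run out.
public class StoppingRule {

    public static boolean shouldStop(double qualityScore, int iterationsDone,
                                     double threshold, int maxIterations) {
        return qualityScore >= threshold || iterationsDone >= maxIterations;
    }
}
```

Instruction-based stopping (the third criterion) is left to the agent's prompt; this predicate only covers the deterministic part of the decision.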

ADK Callbacks for Tracking​

public class CostTrackingCallback implements AfterModelCallbackSync {

    private final CostAnalysisBuilder costBuilder = new CostAnalysisBuilder();

    @Override
    public Maybe<Content> call(CallbackContext callbackContext) {
        // Extract token usage from Gemini response
        UsageMetadata usage = callbackContext.modelResponse().usageMetadata();

        // Track per-model costs
        costBuilder.addTurn(
            callbackContext.invocationContext().getAgent().model(),
            usage.getPromptTokenCount(),
            usage.getCandidatesTokenCount(),
            usage.getCachedContentTokenCount());

        // Log turn completion
        logger.info("Turn {}: {} tokens ({} cached)",
            costBuilder.getTurnCount(),
            usage.getTotalTokenCount(),
            usage.getCachedContentTokenCount());

        return Maybe.empty(); // Don't modify content
    }
}

Comparison: ADK vs Custom Workflow​

| Aspect | Custom Implementation | ADK Implementation |
|---|---|---|
| Agent Loop | Manual orchestration | Built-in AutoFlow |
| Tool Calling | Custom logic | Native FunctionTool |
| Model Switching | Direct API calls | Tools with embedded models |
| State Management | Manual tracking | SessionService + MemoryService |
| Event Streaming | Custom events | RxJava Flowable<Event> |
| Cost Tracking | Custom | Callbacks + UsageMetadata |
| Testing | Custom harness | ADK Dev UI |
| Debugging | Logs only | Dev UI + Event traces |

Maven Configuration​

Add to pom.xml:

<dependencies>
  <!-- ADK Core - For agent orchestration -->
  <dependency>
    <groupId>com.google.adk</groupId>
    <artifactId>google-adk</artifactId>
    <version>0.3.0</version>
  </dependency>

  <!-- ADK Dev UI - For local testing (optional) -->
  <dependency>
    <groupId>com.google.adk</groupId>
    <artifactId>google-adk-dev</artifactId>
    <version>0.3.0</version>
    <scope>provided</scope>
  </dependency>

  <!-- Vertex AI SDK - For Gemini models -->
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-aiplatform</artifactId>
    <version>3.x.x</version>
  </dependency>
</dependencies>

Build Command:

export JAVA_HOME=/usr/lib/jvm/temurin-23-jdk-arm64
mvn clean install

ADK Best Practices (from existing code)​

Based on ArchitecturalPlanReviewAgent and Issue #257:

  1. Use FunctionTool for Custom Logic:

    FunctionTool.create(PageExplanationAgent.class, "generateInitialDraft")
  2. Leverage OpenAPI for External Services (if needed):

    OpenApiToolset toolset = OpenApiToolset.builder()
        .addOpenApiSpecFromFile("openapi.yaml")
        .baseUrl("http://localhost:8082")
        .build();
  3. Use GenerateContentConfig for Model Settings:

    .generateContentConfig(
        GenerateContentConfig.builder()
            .temperature(0.0F) // Maximum predictability
            .build())
  4. Expose ROOT_AGENT for Dev UI:

    public static BaseAgent ROOT_AGENT = initAgent();
  5. Use InMemoryRunner for Execution:

    InMemoryRunner runner = new InMemoryRunner(ROOT_AGENT);
  6. Handle Events with RxJava:

    Flowable<Event> events = runner.runAsync(userId, sessionId, userMsg);
    events.blockingForEach(event -> processEvent(event));

Implementation Details​

Backend Implementation​

1. PageExplanationService​

File: src/main/java/org/codetricks/construction/code/assistant/understanding/PageExplanationService.java

package org.codetricks.construction.code.assistant.understanding;

import com.google.protobuf.Timestamp;
import org.codetricks.construction.code.assistant.FileSystemHandler;
import org.codetricks.construction.code.assistant.proto.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Service;

import java.time.Instant;
import java.util.Optional;

/**
 * Service for generating and managing AI-powered page explanations.
 *
 * This service orchestrates the agentic workflow that transforms raw plan page
 * content (OCR text + PDF) into comprehensive, educational markdown.
 *
 * Related:
 * - PRD: docs/04-prd/plan-page-explanation.md
 * - TDD: docs/05-tdd/plan-page-explanation.md
 */
@Service
public class PageExplanationService {

    private static final Logger logger = LoggerFactory.getLogger(PageExplanationService.class);

    private final FileSystemHandler fileSystemHandler;
    private final ProjectPathResolver pathResolver;
    private final AgenticPageInterpreter agenticInterpreter;
    private final PromptCachingManager cachingManager;

    public PageExplanationService(
        FileSystemHandler fileSystemHandler,
        ProjectPathResolver pathResolver,
        AgenticPageInterpreter agenticInterpreter,
        PromptCachingManager cachingManager) {
        this.fileSystemHandler = fileSystemHandler;
        this.pathResolver = pathResolver;
        this.agenticInterpreter = agenticInterpreter;
        this.cachingManager = cachingManager;
    }

    /**
     * Generate page explanation with agentic workflow.
     */
    public GeneratePageExplanationResponse generatePageExplanation(
        GeneratePageExplanationRequest request) {

        String projectId = request.getProjectId();
        String fileId = request.getFileId();
        int pageNumber = request.getPageNumber();

        logger.info("Starting page explanation generation: project={}, file={}, page={}",
            projectId, fileId, pageNumber);

        try {
            // 1. Check if explanation already exists (unless force regenerate)
            if (!request.getForceRegenerate() && explanationExists(projectId, fileId, pageNumber)) {
                logger.info("Page explanation already exists, skipping generation");
                return GeneratePageExplanationResponse.newBuilder()
                    .setSuccess(true)
                    .setStatusMessage("Page explanation already exists")
                    .setMetadata(loadExistingMetadata(projectId, fileId, pageNumber))
                    .build();
            }

            // 2. Load page context (page.md, page.pdf, metadata.json)
            PageContext pageContext = loadPageContext(projectId, fileId, pageNumber);

            // 3. Update metadata to "processing" status
            updateMetadataStatus(projectId, fileId, pageNumber, "processing");

            // 4. Run agentic workflow
            long startTime = System.currentTimeMillis();

            AgenticInterpretationResult result = agenticInterpreter.explainPage(
                pageContext,
                request.getMaxIterations() > 0 ? request.getMaxIterations() : 3,
                request.getPrimaryModel().isEmpty() ? null : request.getPrimaryModel(),
                request.getEnableCaching());

            long processingTimeMs = System.currentTimeMillis() - startTime;

            // 5. Save page-explanation.md
            String pageFolderPath = pathResolver.resolvePageFolderPath(projectId, pageNumber, fileId);
            String explanationPath = pageFolderPath + "/page-explanation.md";
            fileSystemHandler.writeFile(explanationPath, result.getFinalMarkdown());

            // 6. Update metadata with results
            PageExplanationMetadata metadata = buildMetadata(
                result,
                request.getPrimaryModel().isEmpty() ? "gemini-2.5-pro-latest" : request.getPrimaryModel(),
                (int) (processingTimeMs / 1000));
            saveMetadata(projectId, fileId, pageNumber, metadata);

            logger.info("Page explanation generated successfully: tokens={}, time={}s",
                result.getTotalTokensUsed(), processingTimeMs / 1000);

            return GeneratePageExplanationResponse.newBuilder()
                .setSuccess(true)
                .setStatusMessage("Page explanation generated successfully")
                .setMetadata(metadata)
                .setProcessingTimeSeconds((int) (processingTimeMs / 1000))
                .build();

        } catch (Exception e) {
            logger.error("Failed to generate page explanation", e);

            // Update metadata to "failed" status
            updateMetadataStatus(projectId, fileId, pageNumber, "failed");

            return GeneratePageExplanationResponse.newBuilder()
                .setSuccess(false)
                .setStatusMessage("Failed to generate page explanation")
                .setErrorCode("GENERATION_FAILED")
                .setErrorDetails(e.getMessage())
                .build();
        }
    }

    /**
     * Get page explanation content (for Details tab).
     */
    public Optional<String> getPageExplanation(String projectId, String fileId, int pageNumber)
        throws PageNotFoundException {
        String pageFolderPath = pathResolver.resolvePageFolderPath(projectId, pageNumber, fileId);
        String explanationPath = pageFolderPath + "/page-explanation.md";

        if (fileSystemHandler.exists(explanationPath)) {
            return Optional.of(fileSystemHandler.readFile(explanationPath));
        }
        return Optional.empty();
    }

    /**
     * Check if page explanation exists.
     */
    private boolean explanationExists(String projectId, String fileId, int pageNumber)
        throws PageNotFoundException {
        String pageFolderPath = pathResolver.resolvePageFolderPath(projectId, pageNumber, fileId);
        String explanationPath = pageFolderPath + "/page-explanation.md";
        return fileSystemHandler.exists(explanationPath);
    }

    /**
     * Load page context (inputs for agentic workflow).
     *
     * Uses ProjectPathResolver for consistent path resolution.
     */
    private PageContext loadPageContext(String projectId, String fileId, int pageNumber)
        throws PageNotFoundException {
        // Load page.md using ProjectPathResolver
        String pageMarkdownPath = pathResolver.getPageMarkdownPath(projectId, pageNumber, fileId);
        String pageMarkdown = fileSystemHandler.readFile(pageMarkdownPath);

        // Load page.pdf using ProjectPathResolver
        String pagePdfPath = pathResolver.getPagePdfPath(projectId, pageNumber, fileId);
        byte[] pagePdfBytes = fileSystemHandler.readFileBytes(pagePdfPath);

        // Load metadata.json using ProjectPathResolver
        String metadataPath = pathResolver.getPageMetadataPath(projectId, pageNumber, fileId);
        String metadataJson = fileSystemHandler.exists(metadataPath)
            ? fileSystemHandler.readFile(metadataPath)
            : "{}";

        return PageContext.builder()
            .projectId(projectId)
            .fileId(fileId)
            .pageNumber(pageNumber)
            .pageMarkdown(pageMarkdown)
            .pagePdfBytes(pagePdfBytes)
            .metadataJson(metadataJson)
            .build();
    }

/**
* Build metadata from agentic result.
*/
private PageExplanationMetadata buildMetadata(
AgenticInterpretationResult result,
String modelName,
int processingTimeSeconds) {

return PageExplanationMetadata.newBuilder()
.setStatus("completed")
.setGeneratedAt(Timestamp.newBuilder()
.setSeconds(Instant.now().getEpochSecond())
.build())
.setModel(modelName)
.setIterations(result.getIterationCount())
.setTokensUsed(TokenUsage.newBuilder()
.setInput(result.getTotalInputTokens())
.setOutput(result.getTotalOutputTokens())
.setCached(result.getTotalCachedTokens())
.build())
.setFilePath("page-explanation.md")
.build();
}

/**
* Save metadata to metadata.json.
*/
private void saveMetadata(String projectId, String fileId, int pageNumber,
PageExplanationMetadata metadata) throws PageNotFoundException {
String metadataPath = pathResolver.getPageMetadataPath(projectId, pageNumber, fileId);

// Read existing metadata, update understanding section, write back
// (Implementation uses JSON merging logic - omitted for brevity)

logger.info("Saved page explanation metadata: {}", metadataPath);
}

/**
* Update metadata status only.
*/
private void updateMetadataStatus(String projectId, String fileId, int pageNumber, String status) {
// Similar to saveMetadata but only updates status field
logger.info("Updated page explanation status to: {}", status);
}

/**
* Load existing metadata (if already generated).
*/
private PageExplanationMetadata loadExistingMetadata(String projectId, String fileId, int pageNumber) {
// Load from metadata.json and parse understanding section
// (Implementation omitted for brevity)
return PageExplanationMetadata.newBuilder()
.setStatus("completed")
.build();
}
}
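The JSON merging logic elided in saveMetadata amounts to replacing one section of metadata.json while preserving the rest. Here is a minimal sketch of that merge using plain Maps; the real implementation presumably parses the file with a JSON library first, and mergeUnderstanding is a hypothetical helper, not part of the service:

```java
import java.util.HashMap;
import java.util.Map;

public class MetadataMerge {

    // Replace only the "understanding" section; every other key survives untouched.
    public static Map<String, Object> mergeUnderstanding(
            Map<String, Object> existing, Map<String, Object> understanding) {
        Map<String, Object> merged = new HashMap<>(existing);
        merged.put("understanding", understanding);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> existing = new HashMap<>();
        existing.put("page_number", 3);

        Map<String, Object> understanding = new HashMap<>();
        understanding.put("status", "completed");
        understanding.put("file_path", "page-explanation.md");

        Map<String, Object> merged = mergeUnderstanding(existing, understanding);
        System.out.println(merged.containsKey("page_number")); // existing sections preserved
    }
}
```

The same shape works for updateMetadataStatus, which would merge only a single status field into the understanding section.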

2. AgenticPageInterpreter (Agentic Workflow Engine)​

File: src/main/java/org/codetricks/construction/code/assistant/understanding/AgenticPageInterpreter.java

package org.codetricks.construction.code.assistant.understanding;

import org.codetricks.construction.code.assistant.llm.LLMClient;
import org.codetricks.construction.code.assistant.llm.PromptCachingManager;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
* Agentic workflow engine for iterative page explanation generation.
*
* Implements a multi-turn ReAct-inspired loop:
* 1. Generate initial understanding draft
* 2. Reflect on draft quality (identify gaps)
* 3. Refine draft with reflection insights
*
* Uses prompt caching for PDF images across turns to reduce costs.
*/
@Component
public class AgenticPageInterpreter {

private static final Logger logger = LoggerFactory.getLogger(AgenticPageInterpreter.class);

private final LLMClient llmClient;
private final PromptCachingManager cachingManager;
private final PromptTemplateLoader promptLoader;

public AgenticPageInterpreter(
LLMClient llmClient,
PromptCachingManager cachingManager,
PromptTemplateLoader promptLoader) {
this.llmClient = llmClient;
this.cachingManager = cachingManager;
this.promptLoader = promptLoader;
}

/**
* Interpret architectural plan page with multi-turn agentic workflow.
*
* @param pageContext Input context (page.md, page.pdf, metadata)
* @param maxIterations Maximum number of turns (default: 3)
* @param modelName LLM model used for generation, reflection, and refinement
* @param enableCaching Use prompt caching for PDF (default: true)
* @return Final understanding markdown and metrics
*/
public AgenticInterpretationResult explainPage(
PageContext pageContext,
int maxIterations,
String modelName,
boolean enableCaching) {

logger.info("Starting agentic explanation: project={}, file={}, page={}, maxIter={}",
pageContext.getProjectId(), pageContext.getFileId(),
pageContext.getPageNumber(), maxIterations);

// Initialize result tracking
AgenticInterpretationResult.Builder resultBuilder = AgenticInterpretationResult.builder();
List<CostAnalysisMetadata> turnCostsList = new ArrayList<>();

String currentDraft = null;
String reflectionNotes = null;

try {
// TURN 1: Generate initial understanding
logger.info("TURN 1: Generating initial explanation draft");
TurnResult turn1 = generateInitialDraft(pageContext, modelName, enableCaching);
currentDraft = turn1.getOutput();
turnCostsList.add(turn1.getCostAnalysis());

logger.info("TURN 1 completed: {} tokens ({} cached), cost: ${}",
turn1.getCostAnalysis().getTotalTokens(),
turn1.getCostAnalysis().getTokenBreakdown().getCachedContent().getTokenCount(),
turn1.getCostAnalysis().getEstimatedTotalCostUsd());

// If max iterations is 1, skip reflection and refinement
if (maxIterations <= 1) {
logger.info("Max iterations = 1, returning initial draft");
return buildFinalResult(currentDraft, turnCostsList);
}

// TURN 2: Reflect on draft quality
logger.info("TURN 2: Reflecting on draft quality");
TurnResult turn2 = reflectOnDraft(currentDraft, modelName);
reflectionNotes = turn2.getOutput();
turnCostsList.add(turn2.getCostAnalysis());

logger.info("TURN 2 completed: identified improvement areas, cost: ${}",
turn2.getCostAnalysis().getEstimatedTotalCostUsd());

// If max iterations is 2, return draft with reflection logged
if (maxIterations <= 2) {
logger.info("Max iterations = 2, returning draft after reflection");
return buildFinalResult(currentDraft, turnCostsList);
}

// TURN 3: Refine with reflection insights
logger.info("TURN 3: Refining draft with reflection insights");
TurnResult turn3 = refineWithReflection(
pageContext, currentDraft, reflectionNotes, modelName, enableCaching);
currentDraft = turn3.getOutput();
turnCostsList.add(turn3.getCostAnalysis());

logger.info("TURN 3 completed: final explanation generated, cost: ${}",
turn3.getCostAnalysis().getEstimatedTotalCostUsd());

// Additional iterations (if maxIterations > 3)
for (int i = 4; i <= maxIterations; i++) {
logger.info("TURN {}: Additional refinement iteration", i);

// Reflect again
TurnResult reflectAgain = reflectOnDraft(currentDraft, modelName);
reflectionNotes = reflectAgain.getOutput();
turnCostsList.add(reflectAgain.getCostAnalysis());

// Refine again
TurnResult refineAgain = refineWithReflection(
pageContext, currentDraft, reflectionNotes, modelName, enableCaching);
currentDraft = refineAgain.getOutput();
turnCostsList.add(refineAgain.getCostAnalysis());

logger.info("TURN {} completed", i);
}

return buildFinalResult(currentDraft, turnCostsList);

} catch (Exception e) {
logger.error("Agentic explanation failed", e);
throw new RuntimeException("Failed to interpret page", e);
}
}

/**
* TURN 1: Generate initial understanding draft.
*/
private TurnResult generateInitialDraft(
PageContext pageContext,
String modelName,
boolean enableCaching) {

// Load prompt template
String promptTemplate = promptLoader.loadTemplate("page-understanding-generate.txt");

// Build prompt with page context
String prompt = promptTemplate
.replace("{{PAGE_MARKDOWN}}", pageContext.getPageMarkdown())
.replace("{{PROJECT_ID}}", pageContext.getProjectId())
.replace("{{FILE_ID}}", pageContext.getFileId())
.replace("{{PAGE_NUMBER}}", String.valueOf(pageContext.getPageNumber()));

// Prepare LLM request with PDF image
// Uses PRIMARY model for quality-critical generation
LLMRequest request = LLMRequest.builder()
.modelName(modelName)
.systemPrompt(promptLoader.loadTemplate("page-understanding-system.txt"))
.userPrompt(prompt)
.imageBytes(pageContext.getPagePdfBytes())
.imageMediaType("application/pdf")
.enableCaching(enableCaching)
.cacheImageMarker(true) // Mark PDF for caching
.maxTokens(4000)
.temperature(0.0) // Maximum predictability and consistency
.build();

// Call LLM
LLMResponse response = llmClient.generate(request);

// Return result with CostAnalysisMetadata
return TurnResult.builder()
.turnNumber(1)
.output(response.getContent())
.costAnalysis(response.getCostAnalysisMetadata()) // From LLM response
.build();
}

/**
* TURN 2: Reflect on draft quality.
*
* MODEL SELECTION: Uses REFLECTION model (cost-optimizable)
* - This turn performs analytical quality assessment
* - Requires: Structured analysis, gap identification, scoring
* - Best models: efficient-tier models such as Gemini Flash
* - Cost: ~$0.002/turn (10-20x cheaper than premium)
*
* OPTIMIZATION OPPORTUNITY:
* Using Gemini Flash instead of Pro for reflection saves ~50% on total cost
* with minimal quality impact (reflection is analytical, not creative).
*/
private TurnResult reflectOnDraft(String draft, String reflectionModel) {

// Load reflection prompt
String promptTemplate = promptLoader.loadTemplate("page-understanding-reflect.txt");
String prompt = promptTemplate.replace("{{DRAFT_MARKDOWN}}", draft);

// Prepare LLM request (no image needed for reflection)
// Uses REFLECTION model (can be cheaper than primary for cost savings)
LLMRequest request = LLMRequest.builder()
.modelName(reflectionModel)
.systemPrompt(promptLoader.loadTemplate("page-understanding-reflect-system.txt"))
.userPrompt(prompt)
.maxTokens(2000)
.temperature(0.0) // Consistent temperature for predictability
.build();

// Call LLM
LLMResponse response = llmClient.generate(request);

// Return result with CostAnalysisMetadata
return TurnResult.builder()
.turnNumber(2)
.output(response.getContent())
.costAnalysis(response.getCostAnalysisMetadata()) // From LLM response
.build();
}

/**
* TURN 3+: Refine draft with reflection insights.
*
* MODEL SELECTION: Uses PRIMARY model (quality-critical)
* - This turn improves explanation based on reflection feedback
* - Requires: Creative refinement, maintaining tone, addressing gaps
* - Best models: premium-tier models such as GPT-4
* - Cost: ~$0.05/turn (expensive but essential for quality)
*
* Note: Must use same model as Turn 1 (Generate) to maintain consistent
* writing style, tone, and quality throughout the explanation.
*/
private TurnResult refineWithReflection(
PageContext pageContext,
String draft,
String reflectionNotes,
String primaryModel,
boolean enableCaching) {

// Load refinement prompt
String promptTemplate = promptLoader.loadTemplate("page-understanding-refine.txt");
String prompt = promptTemplate
.replace("{{DRAFT_MARKDOWN}}", draft)
.replace("{{REFLECTION_NOTES}}", reflectionNotes)
.replace("{{PAGE_MARKDOWN}}", pageContext.getPageMarkdown());

// Prepare LLM request with PDF image (reuse cache)
// Uses PRIMARY model for quality-critical refinement
LLMRequest request = LLMRequest.builder()
.modelName(primaryModel)
.systemPrompt(promptLoader.loadTemplate("page-understanding-system.txt"))
.userPrompt(prompt)
.imageBytes(pageContext.getPagePdfBytes())
.imageMediaType("application/pdf")
.enableCaching(enableCaching)
.cacheImageMarker(true) // Reuse cached PDF
.maxTokens(5000)
.temperature(0.0) // Maximum predictability
.build();

// Call LLM
LLMResponse response = llmClient.generate(request);

// Return result with CostAnalysisMetadata
return TurnResult.builder()
.turnNumber(3)
.output(response.getContent())
.costAnalysis(response.getCostAnalysisMetadata()) // From LLM response
.build();
}

/**
* Build final result from turn cost analyses.
*
* Aggregates CostAnalysisMetadata from all turns using MetaCostAnalysis
* for proper per-model cost tracking.
*/
private AgenticInterpretationResult buildFinalResult(
String finalMarkdown,
List<CostAnalysisMetadata> turnCostsList) {

// Use MetaCostAnalysis to aggregate per-model costs
MetaCostAnalysis.Builder metaBuilder = MetaCostAnalysis.newBuilder();

double totalCost = 0.0;
int totalTokens = 0;

// Aggregate by model
Map<String, CostAnalysisMetadata.Builder> modelCosts = new HashMap<>();

for (CostAnalysisMetadata turnCost : turnCostsList) {
String model = turnCost.getModel();
totalCost += turnCost.getEstimatedTotalCostUsd();
totalTokens += turnCost.getTotalTokens();

// Merge into per-model aggregation
// (Implementation omitted for brevity - would merge token breakdowns)
}

MetaCostAnalysis metaCost = metaBuilder
.setTotalCostUsd(totalCost)
.setTotalTokens(totalTokens)
.setPrimaryModel(turnCostsList.get(0).getModel())
.build();

return AgenticInterpretationResult.builder()
.finalMarkdown(finalMarkdown)
.iterationCount(turnCostsList.size())
.metaCostAnalysis(metaCost)
.turnCosts(turnCostsList)
.build();
}
}
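The per-model merge omitted in buildFinalResult can be sketched as a single grouping pass over the turn costs. TurnCost and ModelTotals below are simplified stand-ins for the CostAnalysisMetadata proto, used only to illustrate the aggregation:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PerModelAggregation {

    // Simplified stand-in for CostAnalysisMetadata.
    public record TurnCost(String model, double costUsd, long tokens) {}

    // Accumulated totals for one model.
    public record ModelTotals(double costUsd, long tokens) {}

    public static Map<String, ModelTotals> aggregate(List<TurnCost> turns) {
        Map<String, ModelTotals> byModel = new LinkedHashMap<>();
        for (TurnCost t : turns) {
            // Sum cost and token totals per model name.
            byModel.merge(t.model(),
                new ModelTotals(t.costUsd(), t.tokens()),
                (a, b) -> new ModelTotals(a.costUsd() + b.costUsd(), a.tokens() + b.tokens()));
        }
        return byModel;
    }

    public static void main(String[] args) {
        List<TurnCost> turns = List.of(
            new TurnCost("primary", 0.05, 12_000),     // Turn 1: generate
            new TurnCost("reflection", 0.002, 3_000),  // Turn 2: reflect
            new TurnCost("primary", 0.06, 14_000));    // Turn 3: refine

        Map<String, ModelTotals> byModel = aggregate(turns);
        System.out.printf("primary: $%.3f over %d tokens%n",
            byModel.get("primary").costUsd(), byModel.get("primary").tokens());
    }
}
```

Splitting totals by model this way is what makes the primary/reflection cost optimization measurable in MetaCostAnalysis.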

3. Prompt Templates​

File: src/main/resources/prompts/page-understanding-generate.txt

You are an expert architectural plan interpreter. Your task is to generate a comprehensive, 
educational explanation of an architectural plan page that makes it accessible to beginners
and non-industry experts.

# Input Context

**Project ID:** {{PROJECT_ID}}
**File ID:** {{FILE_ID}}
**Page Number:** {{PAGE_NUMBER}}

## Raw OCR Text

{{PAGE_MARKDOWN}}

## PDF Image

[Attached: Full-resolution PDF page image]

# Your Task

Generate a rich, educational markdown document that explains this plan page comprehensively.
Your explanation should:

1. **Be Beginner-Friendly**: Use simple language, define technical terms inline, and explain
architectural concepts as if teaching someone new to the field.

2. **Be Comprehensive**: Cover all major elements visible on the page - drawings, tables,
legends, annotations, title blocks, etc.

3. **Explain Relationships**: Show how elements connect (e.g., how zoning requirements affect
building setbacks, how legends map to drawing symbols).

4. **Provide Context**: Explain what each section means in the broader context of construction
and building codes.

5. **Use Rich Markdown**: Structure with headings, lists, tables, blockquotes for definitions,
and emphasis where helpful.

6. **Include Visual Descriptions**: Describe what the drawings show, not just the text.

# Output Format

Generate markdown with the following structure:

```markdown
# [Page Title/Name]

## Overview
Brief introduction to what this page contains and its purpose.

## Key Information

### [Section 1]
Detailed explanation of the first major section...

**Technical Term**: Definition inline for beginners.

### [Section 2]
...

## Understanding the Drawings

Describe visual elements, symbols, and what they represent...

## Architectural Concepts Explained

Explain any complex concepts for beginners...

## Code Compliance Considerations

If relevant, explain how this relates to building codes...

## Summary

Recap the most important takeaways from this page.

```

# Guidelines

- Assume the reader has NO architectural background
- Define ALL technical terms when first used
- Use analogies when explaining complex concepts
- Be thorough but not overwhelming
- Focus on understanding, not just description
- Make it educational and engaging

Generate the educational markdown now:


**File**: `src/main/resources/prompts/page-understanding-reflect.txt`

```text
You are a quality reviewer for educational architectural content. Your task is to review
the following page explanation draft and identify areas for improvement.

# Draft to Review

{{DRAFT_MARKDOWN}}

# Your Task

Analyze this draft and identify:

1. **Gaps in Coverage**: What important elements from the page are missing or under-explained?

2. **Clarity Issues**: Where is the language unclear, too technical, or confusing for beginners?

3. **Missing Context**: Where could relationships between elements be explained better?

4. **Definition Gaps**: Are there technical terms that need inline definitions?

5. **Structure Issues**: Could the organization be improved for better readability?

6. **Educational Value**: Where could the content be more engaging or educational?

# Output Format

Provide your reflection as structured JSON:

```json
{
"gaps": [
"Missing explanation of X",
"Section Y needs more detail on Z"
],
"clarity_issues": [
"Term 'ABC' is not defined",
"Paragraph about DEF is too technical"
],
"missing_context": [
"Relationship between X and Y not explained",
"How Z affects building design unclear"
],
"structure_suggestions": [
"Consider adding a subsection for X",
"Reorder sections Y and Z for better flow"
],
"overall_assessment": "Brief summary of draft quality and main improvement areas"
}

```

Generate your reflection now:
```

**File**: `src/main/resources/prompts/page-understanding-refine.txt`

```text
You are an expert architectural plan interpreter refining an educational explanation based
on quality feedback.

# Original Draft

{{DRAFT_MARKDOWN}}

# Reflection and Improvement Areas

{{REFLECTION_NOTES}}

# Original Raw Content (for reference)

{{PAGE_MARKDOWN}}

## PDF Image (for reference)

[Attached: Full-resolution PDF page image]

# Your Task

Improve the draft by addressing the identified issues:

1. Fill gaps in coverage
2. Clarify unclear sections
3. Add missing context and relationships
4. Define missing technical terms inline
5. Improve structure if needed
6. Enhance educational value

# Guidelines

- Keep what works well in the original draft
- Focus improvements on the identified issues
- Maintain beginner-friendly language
- Ensure comprehensive coverage
- Make it engaging and educational

# Output Format

Generate the improved markdown (full document, not just changes):

```markdown
[Your improved, comprehensive page explanation here]

```

Generate the refined understanding now:
```


### Frontend Implementation

#### 1. Update PageViewerComponent

**File**: `web-ng-m3/src/app/components/page-viewer/page-viewer.component.ts`

```typescript
import { Component, OnInit, Input } from '@angular/core';
import { ArchitecturalPlanService } from '../../shared/architectural-plan.service';
import { ArchitecturalPlanPage, PageExplanationMetadata } from '../../shared/proto/api';

@Component({
selector: 'app-page-viewer',
templateUrl: './page-viewer.component.html',
styleUrls: ['./page-viewer.component.scss']
})
export class PageViewerComponent implements OnInit {
@Input() projectId!: string;
@Input() fileId!: string;
@Input() pageNumber!: number;

// Tab state
selectedTabIndex = 0; // 0: Overview, 1: Preview, 2: Compliance, 3: Details (NEW)

// Page data
page?: ArchitecturalPlanPage;

// Details tab state (NEW)
understandingMarkdown?: string;
understandingMetadata?: PageExplanationMetadata;
understandingLoading = false;
understandingError?: string;

constructor(private planService: ArchitecturalPlanService) {}

ngOnInit(): void {
this.loadPage();
}

/**
* Load page data (including understanding if available).
*/
loadPage(): void {
this.planService.getArchitecturalPlanPage(
this.projectId,
this.fileId,
this.pageNumber,
true // include_understanding = true
).subscribe({
next: (page) => {
this.page = page;

// Check if understanding is available
if (page.hasUnderstanding && page.understandingMarkdown) {
this.understandingMarkdown = page.understandingMarkdown;
this.understandingMetadata = page.understandingMetadata;
}
},
error: (err) => {
console.error('Failed to load page', err);
}
});
}

/**
* Handle Details tab selection (lazy load if needed).
*/
onTabChange(tabIndex: number): void {
this.selectedTabIndex = tabIndex;

// If Details tab (index 3) and understanding not loaded yet
if (tabIndex === 3 && !this.understandingMarkdown && !this.understandingError) {
this.loadUnderstanding();
}
}

/**
* Load page explanation (triggered by Details tab selection).
*/
loadUnderstanding(): void {
// If metadata says it's processing, show loading state
if (this.understandingMetadata?.status === 'processing') {
this.understandingLoading = true;
// Poll for completion (or use WebSocket for real-time updates)
this.pollUnderstandingStatus();
return;
}

// If metadata says it's failed, show error
if (this.understandingMetadata?.status === 'failed') {
this.understandingError = 'Failed to generate page explanation';
return;
}

// Otherwise, trigger generation if not exists
if (!this.page?.hasUnderstanding) {
this.generateUnderstanding();
}
}

/**
* Trigger page explanation generation.
*/
generateUnderstanding(): void {
this.understandingLoading = true;
this.understandingError = undefined;

this.planService.generatePageExplanation(
this.projectId,
this.fileId,
this.pageNumber
).subscribe({
next: (response) => {
if (response.success) {
// Reload page to get understanding content
this.loadPage();
} else {
this.understandingError = response.statusMessage;
}
this.understandingLoading = false;
},
error: (err) => {
console.error('Failed to generate understanding', err);
this.understandingError = 'Failed to generate page explanation';
this.understandingLoading = false;
}
});
}

/**
* Poll for understanding generation completion.
*/
pollUnderstandingStatus(): void {
const pollInterval = setInterval(() => {
this.planService.getPageExplanationStatus(
this.projectId,
this.fileId,
this.pageNumber
).subscribe({
next: (metadata) => {
if (metadata.status === 'completed') {
clearInterval(pollInterval);
this.loadPage(); // Reload to get content
this.understandingLoading = false;
} else if (metadata.status === 'failed') {
clearInterval(pollInterval);
this.understandingError = 'Failed to generate page explanation';
this.understandingLoading = false;
}
},
error: (err) => {
console.error('Failed to poll status', err);
clearInterval(pollInterval);
this.understandingError = 'Failed to check generation status';
this.understandingLoading = false;
}
});
}, 5000); // Poll every 5 seconds
}
}

File: web-ng-m3/src/app/components/page-viewer/page-viewer.component.html

<mat-card class="page-viewer-card">
<!-- Tab Group with NEW Details tab -->
<mat-tab-group [(selectedIndex)]="selectedTabIndex" (selectedTabChange)="onTabChange($event.index)">

<!-- Overview Tab (existing) -->
<mat-tab label="Overview">
<div class="tab-content">
<app-page-overview [page]="page"></app-page-overview>
</div>
</mat-tab>

<!-- Preview Tab (existing) -->
<mat-tab label="Preview">
<div class="tab-content">
<app-page-preview [page]="page"></app-page-preview>
</div>
</mat-tab>

<!-- Compliance Tab (existing) -->
<mat-tab label="Compliance">
<div class="tab-content">
<app-page-compliance [page]="page"></app-page-compliance>
</div>
</mat-tab>

<!-- Details Tab (NEW) -->
<mat-tab label="Details">
<div class="tab-content details-tab">

<!-- Loading State -->
<div *ngIf="understandingLoading" class="loading-state">
<mat-spinner diameter="40"></mat-spinner>
<p>Generating detailed page explanation...</p>
<p class="loading-hint">This may take 1-2 minutes. AI is analyzing the page content.</p>
</div>

<!-- Error State -->
<div *ngIf="understandingError && !understandingLoading" class="error-state">
<mat-icon color="warn">error</mat-icon>
<p>{{ understandingError }}</p>
<button mat-raised-button color="primary" (click)="generateUnderstanding()">
<mat-icon>refresh</mat-icon> Retry
</button>
</div>

<!-- Content State -->
<div *ngIf="understandingMarkdown && !understandingLoading" class="understanding-content">
<!-- Metadata Banner -->
<div class="metadata-banner">
<mat-icon>auto_awesome</mat-icon>
<span>AI-generated explanation</span>
<span class="metadata-details">
Generated {{ understandingMetadata?.generatedAt | date:'short' }} |
{{ understandingMetadata?.iterations }} iterations
</span>
</div>

<!-- Markdown Content -->
<markdown [data]="understandingMarkdown" class="markdown-content"></markdown>
</div>

<!-- Empty State (no understanding available, not generating) -->
<div *ngIf="!understandingMarkdown && !understandingLoading && !understandingError" class="empty-state">
<mat-icon>description</mat-icon>
<h3>Details Not Yet Generated</h3>
<p>AI-powered page explanation has not been generated for this page yet.</p>
<button mat-raised-button color="primary" (click)="generateUnderstanding()">
<mat-icon>auto_awesome</mat-icon> Generate Details
</button>
</div>

</div>
</mat-tab>

</mat-tab-group>
</mat-card>

File: web-ng-m3/src/app/components/page-viewer/page-viewer.component.scss

.page-viewer-card {
margin: 16px;
}

.tab-content {
padding: 24px;
min-height: 400px;
}

.details-tab {
.loading-state,
.error-state,
.empty-state {
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
min-height: 400px;
text-align: center;

mat-icon {
font-size: 48px;
width: 48px;
height: 48px;
margin-bottom: 16px;
}

p {
margin: 8px 0;
color: #666;
}

.loading-hint {
font-size: 0.875rem;
font-style: italic;
}

button {
margin-top: 16px;
}
}

.understanding-content {
.metadata-banner {
display: flex;
align-items: center;
gap: 8px;
padding: 12px 16px;
background-color: #e3f2fd;
border-left: 4px solid #2196f3;
margin-bottom: 24px;
border-radius: 4px;

mat-icon {
color: #2196f3;
}

.metadata-details {
margin-left: auto;
font-size: 0.875rem;
color: #666;
}
}

.markdown-content {
// Markdown styling
font-size: 1rem;
line-height: 1.6;

h1, h2, h3, h4, h5, h6 {
margin-top: 1.5em;
margin-bottom: 0.5em;
font-weight: 600;
}

h1 { font-size: 2rem; border-bottom: 2px solid #e0e0e0; padding-bottom: 0.3em; }
h2 { font-size: 1.5rem; border-bottom: 1px solid #e0e0e0; padding-bottom: 0.3em; }
h3 { font-size: 1.25rem; }
h4 { font-size: 1.1rem; }

p {
margin-bottom: 1em;
}

ul, ol {
margin-bottom: 1em;
padding-left: 2em;
}

li {
margin-bottom: 0.5em;
}

table {
width: 100%;
border-collapse: collapse;
margin-bottom: 1em;

th, td {
border: 1px solid #e0e0e0;
padding: 8px 12px;
text-align: left;
}

th {
background-color: #f5f5f5;
font-weight: 600;
}
}

blockquote {
border-left: 4px solid #2196f3;
padding-left: 16px;
margin-left: 0;
color: #666;
font-style: italic;
}

code {
background-color: #f5f5f5;
padding: 2px 6px;
border-radius: 3px;
font-family: 'Courier New', monospace;
font-size: 0.9em;
}

pre {
background-color: #f5f5f5;
padding: 12px;
border-radius: 4px;
overflow-x: auto;

code {
background-color: transparent;
padding: 0;
}
}

strong, b {
font-weight: 600;
color: #000;
}
}
}
}

CLI Tools​

Local Generation Script​

File: scripts/generate-page-explanation.sh

#!/bin/bash

################################################################################
# Generate Page Understanding (Local Development)
#
# Generates AI-powered page explanation for architectural plan pages using
# local project folders. Supports rapid iteration without cloud deployments.
#
# Usage:
# ./scripts/generate-page-explanation.sh --project-path=PATH [OPTIONS]
#
# Examples:
# # Generate for all pages in a project
# ./scripts/generate-page-explanation.sh \
# --project-path=projects/R2024.0091-2024-10-14
#
# # Generate for specific file and pages
# ./scripts/generate-page-explanation.sh \
# --project-path=projects/R2024.0091-2024-10-14 \
# --file-id=1 \
# --page-numbers=1,2,3
#
# # Force regeneration with verbose logging
# ./scripts/generate-page-explanation.sh \
# --project-path=projects/R2024.0091-2024-10-14 \
# --force \
# --verbose
#
# Prerequisites:
# - Java 17+ (Temurin 23 in dev container)
# - Maven 3.8+
# - Vertex AI credentials configured
# - Project structure: files/{file_id}/pages/{page_number}/
#
# What it does:
# 1. Validates project path and structure
# 2. Discovers pages to process
# 3. Calls PageExplanationService for each page
# 4. Generates page-explanation.md files
# 5. Updates metadata.json with generation status
# 6. Outputs summary (pages processed, tokens used, time)
################################################################################

set -e # Exit on any error

# Color codes for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Helper functions
log_info() { echo -e "${BLUE}ℹ️ $1${NC}"; }
log_success() { echo -e "${GREEN}βœ… $1${NC}"; }
log_warning() { echo -e "${YELLOW}⚠️ $1${NC}"; }
log_error() { echo -e "${RED}❌ $1${NC}"; }
log_section() {
echo ""
echo -e "${BLUE}================================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}================================================${NC}"
echo ""
}

# Default values
PROJECT_PATH=""
FILE_ID=""
PAGE_NUMBERS=""
FORCE=false
VERBOSE=false
MAX_ITERATIONS=3

# Parse arguments
while [[ $# -gt 0 ]]; do
case $1 in
--project-path=*)
PROJECT_PATH="${1#*=}"
shift
;;
--file-id=*)
FILE_ID="${1#*=}"
shift
;;
--page-numbers=*)
PAGE_NUMBERS="${1#*=}"
shift
;;
--force)
FORCE=true
shift
;;
--verbose)
VERBOSE=true
shift
;;
--max-iterations=*)
MAX_ITERATIONS="${1#*=}"
shift
;;
*)
log_error "Unknown argument: $1"
exit 1
;;
esac
done

# Validate required arguments
if [ -z "$PROJECT_PATH" ]; then
log_error "Missing required argument: --project-path"
echo "Usage: $0 --project-path=PATH [OPTIONS]"
exit 1
fi

# Validate project path exists
if [ ! -d "$PROJECT_PATH" ]; then
log_error "Project path does not exist: $PROJECT_PATH"
exit 1
fi

log_section "Page Understanding Generation"

log_info "Project: $PROJECT_PATH"
log_info "File ID: ${FILE_ID:-all files}"
log_info "Page Numbers: ${PAGE_NUMBERS:-all pages}"
log_info "Force Regenerate: $FORCE"
log_info "Verbose Logging: $VERBOSE"
log_info "Max Iterations: $MAX_ITERATIONS"

# Build Java command
log_section "Building Maven Project"

export JAVA_HOME=/usr/lib/jvm/temurin-23-jdk-arm64
mvn clean install -DskipTests

# Run generation (using Spring Boot CLI runner or direct service call)
log_section "Generating Page Understanding"

java -cp "target/classes:target/dependency/*" \
org.codetricks.construction.code.assistant.cli.GeneratePageExplanationCLI \
--project-path="$PROJECT_PATH" \
--file-id="$FILE_ID" \
--page-numbers="$PAGE_NUMBERS" \
--force="$FORCE" \
--verbose="$VERBOSE" \
--max-iterations="$MAX_ITERATIONS"

log_success "Page understanding generation complete!"

Project Upgrade and Generation Script​

File: scripts/upgrade-project-and-generate.sh

#!/bin/bash

################################################################################
# Upgrade Project to Multi-File Structure and Generate Understanding
#
# Combines project upgrade with page explanation generation for testing.
#
# Usage:
# ./scripts/upgrade-project-and-generate.sh \
# --source-project=SOURCE \
# --target-project=TARGET
#
# Example:
# ./scripts/upgrade-project-and-generate.sh \
# --source-project=projects/R2024.0091-2024-10-14 \
# --target-project=projects/R2024.0091-test-copy
################################################################################

set -e

# ... (Similar structure to above, omitted for brevity) ...

# 1. Copy project
log_section "Copying Project"
cp -r "$SOURCE_PROJECT" "$TARGET_PROJECT"

# 2. Upgrade to multi-file structure
log_section "Upgrading to Multi-File Structure"
./scripts/migrate-to-multi-file.sh --project-path="$TARGET_PROJECT"

# 3. Generate page explanation
log_section "Generating Page Understanding"
./scripts/generate-page-explanation.sh --project-path="$TARGET_PROJECT"

log_success "Project upgraded and understanding generated!"

Deployment Guide​

Step 1: Build and Test Locally​

# 1. Set Java environment
export JAVA_HOME=/usr/lib/jvm/temurin-23-jdk-arm64

# 2. Build project
mvn clean install

# 3. Run unit tests
mvn test -Dtest=PageExplanationServiceTest

# 4. Test local generation
./scripts/generate-page-explanation.sh \
--project-path=projects/R2024.0091-2024-10-14 \
--file-id=1 \
--page-numbers=3 \
--verbose

Step 2: Deploy Backend to Cloud Run​

# 1. Build Docker image
gcloud builds submit --tag gcr.io/PROJECT_ID/architectural-plan-service

# 2. Deploy to Cloud Run
gcloud run deploy architectural-plan-service \
--image gcr.io/PROJECT_ID/architectural-plan-service \
--platform managed \
--region us-central1 \
--allow-unauthenticated

# 3. Verify deployment
curl https://YOUR_CLOUD_RUN_URL/health

Step 3: Deploy Frontend to Cloud Storage​

# 1. Build Angular app
cd web-ng-m3
npm run build

# 2. Deploy to Cloud Storage
gsutil -m rsync -r -d dist/web-ng-m3 gs://YOUR_BUCKET/

# 3. Invalidate CDN cache (if using Cloud CDN)
gcloud compute url-maps invalidate-cdn-cache URL_MAP_NAME --path "/*"

Step 4: Test End-to-End​

  1. Open application in browser
  2. Navigate to a plan page
  3. Click "Details" tab
  4. Verify that an explanation is generated, or an existing one is displayed
  5. Check browser console for errors
  6. Verify markdown rendering

Performance Optimizations​

1. Prompt Caching​

// Cache PDF image across all turns to reduce costs by 90%
LLMRequest request = LLMRequest.builder()
    .imageBytes(pagePdfBytes)
    .enableCaching(true)
    .cacheImageMarker(true) // Mark for caching
    .build();

// Subsequent turns reuse cached image
// Cost: $0.30/1M tokens (cached) vs $3.00/1M (non-cached)
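The 90% figure in the comment above follows directly from the two rates. A minimal sketch of the arithmetic (`CacheSavings` is a hypothetical helper; the rates are the illustrative values quoted in this document):

```java
public class CacheSavings {
    // Input-token cost at the illustrative rates above:
    // $3.00 per 1M non-cached input tokens vs $0.30 per 1M cached tokens.
    public static double inputCostUsd(long freshTokens, long cachedTokens) {
        return freshTokens * 3.00 / 1_000_000
             + cachedTokens * 0.30 / 1_000_000;
    }
}
```

For the same token volume, serving tokens from the cache costs one tenth of the non-cached price, hence the roughly 90% reduction.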

2. Batch Processing​

// Process multiple pages with same PDF (file-level batching)
public void batchGenerateForFile(String projectId, String fileId, List<Integer> pageNumbers) {
// Load PDF once
byte[] filePdfBytes = loadFilePdf(projectId, fileId);

// Cache PDF at file level
String cacheKey = cachingManager.cachePdf(filePdfBytes);

// Process each page with cached PDF
for (int pageNumber : pageNumbers) {
processPage(projectId, fileId, pageNumber, cacheKey);
}
}

3. Asynchronous Processing​

// Use Cloud Run Jobs for background processing
@Async
public CompletableFuture<GeneratePageExplanationResponse> generateAsync(
        GeneratePageExplanationRequest request) {
    GeneratePageExplanationResponse response = generatePageExplanation(request);
    return CompletableFuture.completedFuture(response);
}
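A caller can fan out several of these async generations and wait for all of them. A minimal sketch using plain `CompletableFuture` (the `supplyAsync` lambda and the `"page-" + p` result stand in for the real service call and response; `AsyncBatchSketch` is a hypothetical name):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class AsyncBatchSketch {
    // Kick off one async generation per page, then wait for all results.
    public static List<String> generateAll(List<Integer> pages) {
        List<CompletableFuture<String>> futures = pages.stream()
                .map(p -> CompletableFuture.supplyAsync(() -> "page-" + p))
                .toList();
        // join() walks the futures in request order, so results come back
        // in the same order regardless of completion order
        return futures.stream().map(CompletableFuture::join).toList();
    }
}
```

With Spring's `@Async` as shown above, the framework supplies the executor; this sketch uses the common pool only to stay self-contained.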

Security Implementation​

1. Access Control​

// Verify user has access to project before generating
@PreAuthorize("hasProjectAccess(#request.projectId)")
public GeneratePageExplanationResponse generatePageExplanation(
GeneratePageExplanationRequest request) {
// ...
}

2. Rate Limiting​

// Limit generation requests per user to prevent abuse
@RateLimited(maxRequests = 10, windowSeconds = 3600)
public GeneratePageExplanationResponse generatePageExplanation(
GeneratePageExplanationRequest request) {
// ...
}
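`@RateLimited` is a project-specific annotation; its interceptor could delegate to a simple fixed-window counter such as the sketch below (an illustration of the windowing logic only, not the actual implementation):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FixedWindowRateLimiter {
    private final int maxRequests;
    private final long windowMillis;
    // Per-user state: [window start millis, request count in window]
    private final Map<String, long[]> windows = new ConcurrentHashMap<>();

    public FixedWindowRateLimiter(int maxRequests, long windowSeconds) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowSeconds * 1000;
    }

    public synchronized boolean tryAcquire(String userId, long nowMillis) {
        long[] state = windows.computeIfAbsent(userId, k -> new long[]{nowMillis, 0});
        if (nowMillis - state[0] >= windowMillis) {
            // Window expired: start a fresh one
            state[0] = nowMillis;
            state[1] = 0;
        }
        if (state[1] >= maxRequests) {
            return false; // over quota for this window
        }
        state[1]++;
        return true;
    }
}
```

With `maxRequests = 10, windowSeconds = 3600` this matches the annotation above: at most 10 generations per user per hour, resetting at the window boundary.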

Observability and Trajectory Tracking​

Agent Trajectory Capture​

Implementation: Capture complete agent execution trace using AgentTrajectory proto.

File: src/main/proto/features/agent_trajectory.proto

Integration with Existing Infrastructure:

  • Reuses existing LlmTrace for individual LLM calls
  • Extends with iteration and tool tracking
  • Stores in BigQuery + GCS for dual access patterns

Trajectory Builder​

public class AgentTrajectoryBuilder {
    private final String trajectoryId;
    private final List<AgentIteration> iterations = new ArrayList<>();
    private final Instant startedAt;
    // Mutable state for the iteration currently being recorded
    private AgentIteration currentIteration;
    private final List<AgentTurn> currentIterationTurns = new ArrayList<>();

    public void startIteration(int iterationNumber) {
        currentIteration = AgentIteration.newBuilder()
            .setIterationNumber(iterationNumber)
            .setStartedAt(Timestamps.fromMillis(System.currentTimeMillis()))
            .build();
        currentIterationTurns.clear();
    }

    public void recordLlmTurn(int turnNumber, String phaseName, LlmTrace llmTrace) {
        AgentTurn turn = AgentTurn.newBuilder()
            .setTurnNumber(turnNumber)
            .setIterationNumber(currentIteration.getIterationNumber())
            .setTurnType(TurnType.LLM_CALL)
            .setLlmTurn(LlmTurn.newBuilder()
                .setModelName(llmTrace.getModelName())
                .setPhaseName(phaseName)
                .setLlmTrace(llmTrace)
                .setInputTokens(llmTrace.getUsageMetadata().getPromptTokenCount())
                .setOutputTokens(llmTrace.getUsageMetadata().getCandidatesTokenCount())
                .setCachedTokens(llmTrace.getUsageMetadata().getCachedContentTokenCount())
                .build())
            .build();

        currentIterationTurns.add(turn);
    }

    public AgentTrajectory build() {
        return AgentTrajectory.newBuilder()
            .setTrajectoryId(trajectoryId)
            .addAllIterations(iterations)
            .setTotalTurns(getTotalTurnCount())
            .setCostAnalysis(aggregateCosts())
            .build();
    }
}

ADK Callbacks for Trajectory Capture​

public class TrajectoryTrackingCallback implements AfterModelCallbackSync {
    private final AgentTrajectoryBuilder trajectoryBuilder;
    private int turnCounter = 0;

    @Override
    public Maybe<Content> call(CallbackContext ctx) {
        turnCounter++;

        // Extract LLM trace from context
        LlmTrace llmTrace = buildLlmTraceFromContext(ctx);

        // Determine phase from agent state
        String phase = determinePhaseName(ctx); // "GENERATE", "REFLECT", "REFINE"

        // Record in trajectory
        trajectoryBuilder.recordLlmTurn(turnCounter, phase, llmTrace);

        // Also log to BigQuery (existing infrastructure)
        llmLogTracer.logTrace(llmTrace);

        return Maybe.empty();
    }

    private String determinePhaseName(CallbackContext ctx) {
        // Analyze the last tool call to determine the phase;
        // look for keywords: "generate", "reflect", "refine"
        String lastToolCalled = ctx.invocationContext().getLastToolName();
        if (lastToolCalled == null) return "UNKNOWN"; // no tool called yet
        if (lastToolCalled.contains("generate")) return "GENERATE";
        if (lastToolCalled.contains("reflect")) return "REFLECT";
        if (lastToolCalled.contains("refine")) return "REFINE";
        return "UNKNOWN";
    }
}

Trajectory Storage​

GCS Path: projects/{projectId}/traces/page-explanation/{trajectory_id}.json
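That object path can be assembled with a small helper (a sketch; `TracePaths` is a hypothetical class name):

```java
public class TracePaths {
    // Formats the GCS object path documented above:
    // projects/{projectId}/traces/page-explanation/{trajectory_id}.json
    public static String pageExplanationTracePath(String projectId, String trajectoryId) {
        return String.format("projects/%s/traces/page-explanation/%s.json",
                projectId, trajectoryId);
    }
}
```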

JSON Export:

public String exportTrajectoryAsJson(AgentTrajectory trajectory, boolean prettyPrint)
        throws InvalidProtocolBufferException {
    // JsonFormat.printer() emits indented JSON by default; strip the
    // whitespace when a compact payload is requested instead.
    JsonFormat.Printer printer = JsonFormat.printer().includingDefaultValueFields();
    if (!prettyPrint) {
        printer = printer.omittingInsignificantWhitespace();
    }
    return printer.print(trajectory);
}

Firestore Index (for searching):

Collection: agent_trajectories
Document ID: {trajectory_id}
Fields:
- workflow_name: "page_explanation"
- project_id: "R2024.0091"
- file_id: "1"
- page_number: 3
- started_at: timestamp
- total_duration_ms: 105000
- total_turns: 3
- iterations_completed: 1
- final_quality_score: 0.85
- total_cost_usd: 0.038
- gcs_path: "projects/.../traces/..."
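Before writing the index document, the fields above can be assembled into a map. A sketch (`TrajectoryIndexDoc` is a hypothetical helper; the Firestore write itself is omitted):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TrajectoryIndexDoc {
    // Builds the field map for a document in the agent_trajectories
    // collection, mirroring the index schema listed above.
    public static Map<String, Object> fields(String projectId, String fileId,
            int pageNumber, long startedAtMillis, long durationMs, int totalTurns,
            int iterationsCompleted, double qualityScore, double costUsd, String gcsPath) {
        Map<String, Object> f = new LinkedHashMap<>();
        f.put("workflow_name", "page_explanation");
        f.put("project_id", projectId);
        f.put("file_id", fileId);
        f.put("page_number", pageNumber);
        f.put("started_at", startedAtMillis);
        f.put("total_duration_ms", durationMs);
        f.put("total_turns", totalTurns);
        f.put("iterations_completed", iterationsCompleted);
        f.put("final_quality_score", qualityScore);
        f.put("total_cost_usd", costUsd);
        f.put("gcs_path", gcsPath);
        return f;
    }
}
```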

CLI Tool: Export Trajectory​

#!/bin/bash
# cli/codeproof.sh export-trajectory

TRAJECTORY_ID="$1"
OUTPUT_FILE="${2:-trajectory.json}"

# Call gRPC API (quote expansions so an unusual ID can't break the JSON)
grpcurl -plaintext -d '{
  "trajectory_id": "'"$TRAJECTORY_ID"'",
  "include_full_llm_traces": true,
  "pretty_print": true
}' \
  localhost:8080 \
  PageExplanationService/ExportAgentTrajectory \
  | jq -r '.trajectory_json' > "$OUTPUT_FILE"

echo "Trajectory exported to: $OUTPUT_FILE"

Future: Trajectory Visualization UI​

Component: TrajectoryViewer (Angular)

Features:

  • Timeline view of all turns
  • Expandable sections for each iteration
  • Diff view between draft versions
  • Cost breakdown visualization
  • Quality score progression chart
  • Search/filter by phase, model, cost

Mockup:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Agent Trajectory: page_explanation β”‚
β”‚ Project: R2024.0091 | Page: 3 | File: 1 β”‚
β”‚ Duration: 1m 45s | Cost: $0.038 | Quality: 0.85β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ πŸ“Š Quality Score: [β–“β–“β–“β–“β–“β–“β–“β–“β–“β–‘] 85% β”‚
β”‚ πŸ’° Cost Breakdown: β”‚
β”‚ β”œβ”€ Gemini Pro (2 turns): $0.038 β”‚
β”‚ └─ Gemini Flash (1 turn): $0.0004 β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ Iteration 1 β–Ό β”‚
β”‚ β”œβ”€ Turn 1 (GENERATE) - Gemini Pro β”‚
β”‚ β”‚ Input: 7500 tokens | Output: 2000 β”‚
β”‚ β”‚ Cost: $0.019 | Time: 35s β”‚
β”‚ β”‚ πŸ“„ View Prompt | View Response β”‚
β”‚ β”œβ”€ Turn 2 (REFLECT) - Gemini Flash β”‚
β”‚ β”‚ Quality: 0.75 | Gaps: 3 identified β”‚
β”‚ β”‚ Cost: $0.0004 | Time: 5s β”‚
β”‚ β”‚ πŸ“Š View Reflection JSON β”‚
β”‚ └─ Turn 3 (REFINE) - Gemini Pro β”‚
β”‚ Cached: 5000 tokens | Cost: $0.019 β”‚
β”‚ Quality: 0.85 βœ“ Threshold met β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Troubleshooting​

Issue: High Token Costs​

Symptoms: Token usage exceeds budget, costs higher than expected

Solutions:

  1. Verify prompt caching is enabled and working
  2. Check cache hit rate in logs
  3. Use cheaper models for reflection turns (e.g., Gemini Flash instead of Gemini Pro)
  4. Reduce max_phases_completed to 2 instead of 3

Issue: Slow Generation​

Symptoms: Takes >5 minutes per page

Solutions:

  1. Check LLM API latency
  2. Reduce image resolution for PDF
  3. Use async processing (don't block user)
  4. Optimize prompts to reduce output tokens

Issue: Poor Quality Output​

Symptoms: Markdown is not educational or contains errors

Solutions:

  1. Review and improve prompt templates
  2. Increase max_phases_completed to 4 or 5
  3. Add example outputs to prompts
  4. Use higher temperature for creativity
  5. Collect human feedback for prompt tuning

References​