
File Structure Reorganization

πŸ“‹ Product Requirements: File Structure Reorganization PRD
πŸ“‹ Implementation Issue: Issue #167

Overview

This Technical Design Document details the implementation of a hierarchical file structure with rich metadata, replacing the flat pages/ directory with files/{file_id}/pages/ while maintaining full backward compatibility with legacy projects.

Architecture Overview

System Components

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚               Frontend (Angular)               β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚ β”‚FileMetadataList β”‚  β”‚ LegacyUpgradeBanner  β”‚  β”‚
β”‚ β”‚    Component    β”‚  β”‚      Component       β”‚  β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”‚          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜              β”‚
β”‚                       β”‚ gRPC-Web               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                        β”‚
                        β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              gRPC Gateway (Envoy)              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                        β”‚
                        β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚             Backend (Java/Spring)              β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚    ArchitecturalPlanService (Facade)     β”‚  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”‚          β”‚                      β”‚              β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚ β”‚InputFileMetadataβ”‚  β”‚FileStructureMigrationβ”‚  β”‚
β”‚ β”‚     Service     β”‚  β”‚        Service       β”‚  β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”‚          β”‚                      β”‚              β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚            ProjectPathResolver             β”‚ β”‚
β”‚ β”‚(Transparent Legacy Fallback + Path Caching)β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚          β”‚                                     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
           β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Cloud Storage (GCS)               β”‚
β”‚  projects/{projectId}/                         β”‚
β”‚  β”œβ”€β”€ files/{file_id}/             ← NEW        β”‚
β”‚  β”‚   β”œβ”€β”€ metadata.json            ← NEW        β”‚
β”‚  β”‚   └── pages/{pageNumber}/      ← NEW        β”‚
β”‚  β”œβ”€β”€ pages/{pageNumber}/          ← LEGACY     β”‚
β”‚  └── inputs/(unknown)             ← UNCHANGED  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Data Flow: Read Operations (Simplified Strategy)

User Request (projectId, pageNum, optional fileId)
                     β”‚
                     β–Ό
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚ ProjectPathResolver                           β”‚
    β”‚ .resolvePagePath(projectId, pageNum, fileId?) β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                     β”‚
                     β–Ό
            Is fileId provided?
                     β”‚
          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
         Yes                    No (Legacy Fallback Only)
          β”‚                      β”‚
          β–Ό                      β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   Is path cached?
β”‚ Modern: Direct Path    β”‚          β”‚
β”‚ (String Construction)  β”‚     β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”
β”‚ files/{fileId}/pages/  β”‚    Yes           No
β”‚                        β”‚     β”‚             β”‚
β”‚ Performance: 0ms       β”‚     β–Ό             β–Ό
β”‚ No I/O, No Cache       β”‚   Return   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   cached   β”‚ Legacy Structure:             β”‚
            β”‚                 path     β”‚ buildLegacyPageFolderPath()   β”‚
            β”‚                          β”‚ + exists() check              β”‚
            β”‚                          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
            β”‚                                          β”‚
            β”‚                                    File exists?
            β”‚                                          β”‚
            β”‚                                  β”Œβ”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”
            β”‚                                 Yes             No
            β”‚                                  β”‚               β”‚
            β”‚                                  β–Ό               β–Ό
            β”‚                           Return legacy    Throw PageNotFound
            β”‚                           path (cache it)  (modern projects
            β”‚                                  β”‚           require file_id)
            β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                     β”‚
                                     β–Ό
                                Return path

Key Design Decisions:

  • βœ… Modern projects MUST provide file_id (page numbers are file-scoped)
  • βœ… No expensive scanning (removed listSubdirectories() call)
  • βœ… Simple caching (first lookup costs one GCS exists() call; repeat lookups are instant in-memory hits)
  • βœ… Clear contract: "Want modern structure? Provide file_id!"

Data Flow: Write Operations (Selective Write)

New Page Ingestion
        β”‚
        β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ IngestArchitecturalPlan           β”‚
β”‚ (projectId, fileId, pageData)     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                 β”‚
                 β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Detect Project Structure Version  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                 β”‚
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
     LEGACY           TRANSITIONAL        MODERN
        β”‚                  β”‚                 β”‚
        β–Ό                  β–Ό                 β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Write to   β”‚    β”‚ Write to   β”‚    β”‚ Write to      β”‚
β”‚ pages/     β”‚    β”‚ files/     β”‚    β”‚ files/ ONLY   β”‚
β”‚ (compat)   β”‚    β”‚ (new path) β”‚    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
      β”‚                 β”‚
      β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
               β”‚
               β–Ό
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚ Update              β”‚
    β”‚ plan-metadata.json  β”‚
    β”‚ (backward compat)   β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
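The version branching above can be expressed as a small pure function returning the write target. This is an illustrative sketch; the path formats are assumptions, and the real writer goes through FileSystemHandler:

```java
// Sketch of the selective-write branching. Path formats are assumed for
// illustration; version values mirror ProjectStructureVersion.
public class SelectiveWriteSketch {
    public enum Version { LEGACY, TRANSITIONAL, MODERN }

    public static String targetFolder(
            Version version, String projectId, String fileId, int pageNumber) {
        switch (version) {
            case LEGACY:
                // Keep the flat structure so existing readers stay consistent.
                return String.format("projects/%s/pages/%03d", projectId, pageNumber);
            case TRANSITIONAL:
            case MODERN:
                // New pages always land in the hierarchical structure.
                return String.format("projects/%s/files/%s/pages/%03d",
                    projectId, fileId, pageNumber);
            default:
                throw new IllegalArgumentException("Unknown version: " + version);
        }
    }
}
```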

Proto Definitions

Note: Proto definitions already exist in api.proto (lines 225-274). No changes needed!

Enum Naming Consideration: The current enums use prefixed naming (e.g., DOCUMENT_TYPE_ARCHITECTURAL_PLAN, PROCESSING_STATUS_COMPLETED) which doesn't follow our best practice of using dedicated packages with clean enum values (e.g., ARCHITECTURAL_PLAN in package ...file.metadata). However, since these enums are already deployed and used in production:

  • βœ… For this issue: Keep existing enum structure (no breaking changes)
  • πŸ“ Future enhancement: Consider moving to file_metadata.proto with clean enums (separate refactoring issue)

This aligns with our pragmatic approach: work with what exists, improve incrementally.

Existing Proto Messages (For Reference)

// Already exists in api.proto (line 225)
import "google/protobuf/timestamp.proto";

message InputFileMetadata {
  // Basic file information
  string file_id = 1; // Auto-increment ID (e.g., "1", "2", "3")
  string file_name = 2;
  string file_path = 3;
  string mime_type = 4;
  int64 file_size_bytes = 5;
  google.protobuf.Timestamp upload_date = 6; // When file was uploaded

  // Document classification
  DocumentType document_type = 7;
  int32 page_count = 8;

  // Processing metadata
  ProcessingStatus processing_status = 9;
  google.protobuf.Timestamp processed_date = 10; // When processing completed
  repeated string extracted_pages = 11;

  // Content insights
  string content_summary = 12;

  // Technical metadata
  string checksum_md5 = 13;
}

enum DocumentType {
  DOCUMENT_TYPE_UNKNOWN = 0;
  DOCUMENT_TYPE_ARCHITECTURAL_PLAN = 1;
  DOCUMENT_TYPE_MECHANICAL_PLAN = 2;
  DOCUMENT_TYPE_ELECTRICAL_PLAN = 3;
  DOCUMENT_TYPE_STRUCTURAL_PLAN = 4;
  DOCUMENT_TYPE_INSPECTOR_FEEDBACK = 5;
  DOCUMENT_TYPE_PERMIT_APPLICATION = 6;
  DOCUMENT_TYPE_CODE_COMPLIANCE_REPORT = 7;
  DOCUMENT_TYPE_SITE_PLAN = 8;
  DOCUMENT_TYPE_ELEVATION_DRAWING = 9;
  DOCUMENT_TYPE_SECTION_DRAWING = 10;
}

enum ProcessingStatus {
  PROCESSING_STATUS_UNKNOWN = 0;
  PROCESSING_STATUS_UPLOADED = 1;
  PROCESSING_STATUS_PROCESSING = 2;
  PROCESSING_STATUS_COMPLETED = 3;
  PROCESSING_STATUS_FAILED = 4;
}

New Proto Messages for Migration (Add to api.proto)

// Request to migrate a legacy project to new file structure
message MigrateProjectFileStructureRequest {
  // The unique identifier of the project to migrate
  string project_id = 1;

  // Whether to preserve the legacy pages/ folder after migration
  // (default: true for safety)
  bool preserve_legacy_structure = 2;

  // Whether to run in dry-run mode (preview changes without applying)
  bool dry_run = 3;

  // User ID initiating the migration (for audit trail)
  string initiated_by = 4;
}

// Response from file structure migration
message MigrateProjectFileStructureResponse {
  // The unique identifier of the project
  string project_id = 1;

  // Whether the migration was successful
  bool success = 2;

  // List of files created with metadata
  repeated InputFileMetadata migrated_files = 3;

  // Number of pages migrated per file
  map<string, int32> pages_per_file = 4;

  // Total number of pages migrated
  int32 total_pages_migrated = 5;

  // Error message if migration failed
  string error_message = 6;

  // Warnings or informational messages
  repeated string warnings = 7;

  // Timestamp when migration completed
  google.protobuf.Timestamp completed_at = 8;
}

// Request to analyze a project's migration readiness
// Performs a comprehensive check to determine if a project can be safely migrated
// from legacy (flat pages/) structure to modern (hierarchical files/) structure.
message AnalyzeProjectMigrationRequest {
  // The unique identifier of the project to analyze
  string project_id = 1;
}

// Response containing detailed migration readiness analysis
// Provides all information needed to decide if/when to migrate a project.
message AnalyzeProjectMigrationResponse {
  // The unique identifier of the project
  string project_id = 1;

  // Current project structure version (LEGACY, TRANSITIONAL, or MODERN)
  // - LEGACY: Only has pages/ folder β†’ needs migration
  // - TRANSITIONAL: Has both pages/ and files/ β†’ migration in progress or partially complete
  // - MODERN: Only has files/ folder β†’ already migrated
  ProjectStructureVersion current_version = 2;

  // Whether project needs migration (true for LEGACY projects only)
  // If false, project is already migrated or in transition
  bool needs_migration = 3;

  // Number of input files found in inputs/ folder
  // Used to estimate how many file metadata entries will be created
  // Good readiness: > 0 (at least one source file exists)
  int32 estimated_file_count = 4;

  // Number of existing pages in pages/ folder
  // Used to estimate migration workload
  // Good readiness: matches actual page count in plan-metadata.json
  int32 estimated_page_count = 5;

  // Estimated time to complete migration in seconds
  // Calculation: (page_count * 1s) + (file_count * 5s)
  // Good readiness: < 300s (5 minutes) for typical projects
  int32 estimated_duration_seconds = 6;

  // Potential issues or blockers that could prevent successful migration
  // Examples:
  //   - "No input files found in inputs/ folder"
  //   - "Page numbering gaps detected (missing pages 3, 5)"
  //   - "Insufficient storage space for migration"
  //   - "Project has no pages to migrate"
  // Good readiness: Empty array (no issues)
  repeated string issues = 7;

  // Human-readable migration readiness assessment
  // Examples: "READY", "READY_WITH_WARNINGS", "NOT_READY", "ALREADY_MIGRATED"
  // Good readiness: "READY" or "READY_WITH_WARNINGS"
  string readiness_status = 8;

  // Detailed explanation of readiness status
  // Provides context and recommendations
  // Example: "Project is ready to migrate. Found 3 input files and 45 pages.
  //           Estimated time: 2 minutes. No blockers detected."
  string readiness_message = 9;
}

// Enum for project structure version
enum ProjectStructureVersion {
  PROJECT_STRUCTURE_VERSION_UNKNOWN = 0;
  PROJECT_STRUCTURE_VERSION_LEGACY = 1;        // Only pages/
  PROJECT_STRUCTURE_VERSION_TRANSITIONAL = 2;  // Both pages/ and files/
  PROJECT_STRUCTURE_VERSION_MODERN = 3;        // Only files/
}
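The duration formula documented on estimated_duration_seconds, together with the 300-second "good readiness" threshold from the field comments, fits in a few lines:

```java
// Sketch of the migration estimate documented on
// AnalyzeProjectMigrationResponse.estimated_duration_seconds:
// (page_count * 1s) + (file_count * 5s), "good readiness" under 5 minutes.
public class MigrationEstimateSketch {
    public static int estimatedDurationSeconds(int pageCount, int fileCount) {
        return pageCount * 1 + fileCount * 5;
    }

    public static boolean withinTypicalBudget(int pageCount, int fileCount) {
        return estimatedDurationSeconds(pageCount, fileCount) < 300;
    }
}
```

For example, 45 pages and 3 input files estimate to 60 seconds, comfortably within the typical budget.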

Add New RPCs to ArchitecturalPlanService

service ArchitecturalPlanService {
  // ... existing RPCs ...

  // Migrates a legacy project to new file structure
  // Requires OWNER permissions
  rpc MigrateProjectFileStructure(MigrateProjectFileStructureRequest)
      returns (MigrateProjectFileStructureResponse) {
    option (google.api.http) = {
      post: "/v1/architectural-plans/{project_id}/migrate-file-structure"
      body: "*"
    };
  }

  // Analyzes a project's migration readiness
  rpc AnalyzeProjectMigration(AnalyzeProjectMigrationRequest)
      returns (AnalyzeProjectMigrationResponse) {
    option (google.api.http) = {
      get: "/v1/architectural-plans/{project_id}/migration-analysis"
    };
  }
}

Update Existing RPC Request Messages (Backward Compatible)

Critical Update: Existing page-related RPCs must be extended to support the new file structure while maintaining backward compatibility with legacy projects.

Strategy: Optional file_id Field

Add an optional file_id field to all page-related request messages. This allows:

  • βœ… Modern projects: Pass file_id for direct page access in files/{file_id}/pages/
  • βœ… Legacy projects: Omit file_id, system uses ProjectPathResolver for fallback to pages/
  • βœ… Zero breaking changes: Existing clients continue to work without modifications

Request Messages Requiring Updates

1. Code Applicability Analysis (api.proto)

message GetApplicableCodeSectionsRequest {
  // The unique identifier of the architectural plan to analyze.
  string architectural_plan_id = 1;
  // The page number of the architectural plan to analyze.
  int32 page_number = 2;
  string icc_book_id = 3; // Example: 2217 for ICC IBC 2021

  // NEW FIELD: Optional file ID for direct file access in modern structure
  // If provided, page is accessed via files/{file_id}/pages/{page_number}/
  // If omitted, ProjectPathResolver falls back to the legacy pages/ structure
  // (modern projects must provide file_id; see Read Operations above)
  // Example: "1", "2", "3" (auto-incrementing IDs)
  string file_id = 4 [deprecated = false]; // Optional, for modern structure support
}

2. Compliance Report Generation (plan.reviewer.proto)

message GetPageSectionComplianceReportRequest {
  string architectural_plan_id = 1;
  int32 page_number = 2;
  string icc_book_id = 3;
  string icc_section_id = 4;

  // NEW FIELD: Optional file ID for hierarchical file structure
  string file_id = 5 [deprecated = false]; // Optional
}

message GetPageComplianceReportRequest {
  string architectural_plan_id = 1;
  int32 page_number = 2;
  string icc_book_id = 3;

  // NEW FIELD: Optional file ID for hierarchical file structure
  string file_id = 4 [deprecated = false]; // Optional
}

3. Async Compliance Report Task (compliance_report.proto)

message StartPageSectionComplianceReportTaskRequest {
  string architectural_plan_id = 1;
  int32 page_number = 2;
  string icc_book_id = 3;
  string icc_section_id = 4;

  // NEW FIELD: Optional file ID for hierarchical file structure
  string file_id = 5 [deprecated = false]; // Optional
}

4. Analysis Availability Check (analysis_availability.proto)

message GetAvailableAnalysisRequest {
  string project_id = 1;
  int32 page_number = 2;

  // NEW FIELD: Optional file ID for hierarchical file structure
  string file_id = 3 [deprecated = false]; // Optional
}

5. File Ingestion Response (api.proto)

Note: IngestFileIntoProjectRequest already has filename and doesn't need file_id as input. However, the response should return the assigned file_id:

message IngestFileIntoProjectResponse {
  string project_id = 1;
  string filename = 2;
  int32 pages_processed = 3;
  bool success = 4;

  // NEW FIELD: Assigned file ID for the ingested file
  // This allows UI to immediately navigate to files/{file_id}/ structure
  // Example: "1", "2", "3"
  string file_id = 5; // REQUIRED in modern projects
}

Similarly for StartAsyncIngestFileResponse (task.proto):

message StartAsyncIngestFileResponse {
  string task_id = 1;
  string project_id = 2;
  string filename = 3;
  int32 page_number = 4;
  bool success = 5;
  string message = 6;
  string completed_at = 7;

  // NEW FIELD: Assigned file ID for the ingested file
  string file_id = 8; // REQUIRED when ingestion completes
}
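The returned file_id comes from the auto-increment counter that InputFileMetadataService keeps in files/index.json (see Backend Implementation below). A minimal in-memory sketch of that assignment; persistence to GCS and concurrent-upload safety are deliberately omitted here:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory sketch of the files/index.json auto-increment counter.
// The real service persists this mapping as JSON in GCS and must also
// guard against concurrent ingestions; both concerns are omitted.
public class FileIdCounterSketch {
    private int nextId = 1;
    private final Map<String, String> filenameToId = new LinkedHashMap<>();

    /** Returns the existing ID for a filename, or assigns the next one. */
    public String assignFileId(String filename) {
        return filenameToId.computeIfAbsent(filename, f -> String.valueOf(nextId++));
    }
}
```

Re-ingesting the same filename keeps its original ID, which is what makes ingestion idempotent per file.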

Backend Service Implementation Pattern​

When processing requests with the new optional file_id:

public PageApplicabilityAnalysisList getApplicableCodeSections(
    GetApplicableCodeSectionsRequest request) {

  String projectId = request.getArchitecturalPlanId();
  int pageNumber = request.getPageNumber();
  String fileId = request.getFileId(); // May be empty/null

  // Resolve page path using ProjectPathResolver (handles file_id automatically)
  // - If fileId provided β†’ Direct path (fast, no filesystem checks)
  // - If fileId null/empty β†’ Legacy fallback (cache, then pages/ exists() check)
  String pagePath = pathResolver.resolvePageFolderPath(projectId, pageNumber, fileId);

  // Continue with existing logic using resolved path
  // ...
}

Key Benefits:

  • βœ… Single source of truth: All path resolution logic centralized in ProjectPathResolver
  • βœ… Automatic optimization: Fast path when file_id provided, dual-read when not
  • βœ… Consistent behavior: Same logic across all services
  • βœ… Easy testing: Mock ProjectPathResolver for unit tests

CLI Updates Required

The CLI commands (grpcurl, custom scripts) must also be updated to support the new optional parameter:

Example: Legacy CLI call (still works)

grpcurl -d '{
  "architectural_plan_id": "project-123",
  "page_number": 5,
  "icc_book_id": "2217"
}' \
  localhost:8080 ArchitecturalPlanReviewService/GetApplicableCodeSections

Example: Modern CLI call (with file_id)

grpcurl -d '{
  "architectural_plan_id": "project-123",
  "page_number": 5,
  "icc_book_id": "2217",
  "file_id": "2"
}' \
  localhost:8080 ArchitecturalPlanReviewService/GetApplicableCodeSections

Frontend/UI Updates Required

1. Update API Service Clients (e.g., web-ng-m3/src/app/shared/api.service.ts):

getApplicableCodeSections(
  projectId: string,
  pageNumber: number,
  iccBookId: string,
  fileId?: string // NEW optional parameter
): Observable<PageApplicabilityAnalysisList> {
  const request: GetApplicableCodeSectionsRequest = {
    architectural_plan_id: projectId,
    page_number: pageNumber,
    icc_book_id: iccBookId
  };

  // Include file_id only if available (modern projects)
  if (fileId) {
    request.file_id = fileId;
  }

  return this.grpcClient.getApplicableCodeSections(request);
}

2. Pass file_id from Components:

When displaying page-specific analysis, components need to know which file the page belongs to. This information comes from:

  • Modern projects: InputFileMetadata.file_id and InputFileMetadata.extracted_pages
  • Legacy projects: file_id is undefined/null, system falls back automatically
// In compliance.component.ts or similar
loadPageAnalysis(pageNumber: number) {
  const fileId = this.getFileIdForPage(pageNumber); // NEW method

  this.apiService.getApplicableCodeSections(
    this.projectId,
    pageNumber,
    this.iccBookId,
    fileId // Pass file_id if available
  ).subscribe(/* ... */);
}

private getFileIdForPage(pageNumber: number): string | undefined {
  // Look up file_id from InputFileMetadata list
  const fileMetadata = this.inputFiles.find(
    f => f.extracted_pages.includes(String(pageNumber))
  );
  return fileMetadata?.file_id;
}

Testing Backward Compatibility

Test Cases:

  1. Legacy Project (no file_id):

    • βœ… Request without file_id β†’ System uses ProjectPathResolver β†’ Falls back to pages/ β†’ Success
  2. Modern Project (with file_id):

    • βœ… Request with file_id β†’ Direct access to files/{file_id}/pages/ β†’ Success
  3. Modern Project (omit file_id):

    • ❌ Request without file_id β†’ Legacy fallback finds nothing under pages/ β†’ PageNotFound (modern projects must pass file_id, per the Simplified Strategy above)
  4. Invalid file_id:

    • ❌ Request with invalid file_id β†’ 404 Page Not Found (expected behavior)
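These cases can be pinned down with a small table-driven check against a stubbed resolver. The stub below is hypothetical (real tests would mock FileSystemHandler) and assumes the simplified legacy-fallback strategy described earlier:

```java
import java.util.Map;

// Table-driven sketch of the backward-compatibility cases above,
// using a stubbed resolver rather than the real ProjectPathResolver.
public class BackwardCompatSketch {
    // Pretend GCS: only the legacy project's page 5 exists under pages/.
    static final Map<String, Boolean> STORAGE = Map.of(
        "projects/legacy-1/pages/005", true);

    static String resolve(String projectId, int pageNumber, String fileId) {
        if (fileId != null && !fileId.isEmpty()) {
            // Cases 2 and 4: direct path; existence is verified downstream,
            // so an invalid file_id surfaces later as a 404.
            return String.format("projects/%s/files/%s/pages/%03d",
                projectId, fileId, pageNumber);
        }
        // Cases 1 and 3: legacy fallback only.
        String legacy = String.format("projects/%s/pages/%03d", projectId, pageNumber);
        if (STORAGE.getOrDefault(legacy, Boolean.FALSE)) {
            return legacy;
        }
        throw new IllegalStateException("PageNotFound: modern projects require file_id");
    }
}
```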

Migration Impact

Phase 1: Deploy Proto Changes

  • Add optional file_id fields to all request messages
  • Deploy backend changes (proto regeneration)
  • No frontend changes yet β†’ Existing clients continue working (backward compatible)

Phase 2: Update Backend Services

  • Modify service implementations to honor file_id when provided
  • Maintain fallback behavior via ProjectPathResolver
  • No frontend changes yet β†’ Still backward compatible

Phase 3: Update Frontend (Optional)

  • Add file_id tracking in UI state
  • Pass file_id in modern projects for performance optimization
  • Legacy projects continue working without changes

Performance Considerations

With file_id (modern projects):

  • βœ… Direct path access: No filesystem checks needed
  • βœ… No cache lookups: Skip ProjectPathResolver cache
  • βœ… Faster response: ~50-100ms saved per request

Without file_id (legacy or omitted):

  • ⚠️ ProjectPathResolver overhead: Cache check + potential filesystem existence checks
  • ⚠️ Acceptable performance: <10ms overhead for cached paths, <100ms for uncached

Recommendation: Frontend should pass file_id when available for optimal performance.


Backend Implementation

1. ProjectPathResolver

Purpose: Resolves file paths transparently across both legacy (pages/) and modern (files/{file_id}/pages/) structures, providing backward compatibility during migration.

Location: src/main/java/org/codetricks/construction/code/assistant/ProjectPathResolver.java

package org.codetricks.construction.code.assistant;

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.logging.Logger;

/**
 * Resolves file paths across both modern (files/{file_id}/pages/) and
 * legacy (pages/) project structures with transparent fallback.
 *
 * <p>Path Resolution Strategy:
 * 1. If a file ID is provided, construct the modern path directly (no I/O)
 * 2. Otherwise, check the in-memory cache (avoid repeated filesystem checks)
 * 3. Otherwise, check the legacy structure: pages/{page_number}/
 * 4. Cache the result for future reads
 *
 * <p>Modern projects must provide a file ID: page numbers are file-scoped,
 * so the modern structure cannot be auto-detected from a page number alone.
 *
 * <p>Thread-safe and optimized for read-heavy workloads.
 *
 * <p>Instantiation: Create via constructor, pass to service implementations
 */
public class ProjectPathResolver {

  private static final Logger logger =
      Logger.getLogger(ProjectPathResolver.class.getName());

  private final FileSystemHandler fileSystemHandler;

  // Cache: projectId + pageNumber -> resolved path
  private final Cache<String, String> pathCache;

  public ProjectPathResolver(FileSystemHandler fileSystemHandler) {
    this.fileSystemHandler = fileSystemHandler;
    this.pathCache = CacheBuilder.newBuilder()
        .maximumSize(10_000)
        .expireAfterWrite(1, TimeUnit.HOURS)
        .build();
  }

  /**
   * Resolves the page folder path with optional file ID for performance optimization.
   *
   * <p><b>Path Resolution Strategy:</b>
   * <ul>
   *   <li>If fileId provided: Direct path construction (FAST - no filesystem checks)</li>
   *   <li>If fileId null/empty: Legacy fallback (check cache β†’ legacy pages/)</li>
   * </ul>
   *
   * @param projectId The unique identifier of the project
   * @param pageNumber The page number (1-based)
   * @param fileId Optional file ID for direct access (null or empty for legacy fallback)
   * @return The resolved page folder path
   * @throws PageNotFoundException if the page doesn't exist in the legacy structure
   */
  public String resolvePageFolderPath(String projectId, int pageNumber, String fileId)
      throws PageNotFoundException {

    // Fast path: If file ID is provided, construct path directly
    if (fileId != null && !fileId.isEmpty()) {
      return String.format("projects/%s/files/%s/pages/%03d",
          projectId, fileId, pageNumber);
    }

    // Slow path: legacy fallback with caching
    String cacheKey = getCacheKey(projectId, pageNumber);

    // Check cache first
    String cachedPath = pathCache.getIfPresent(cacheKey);
    if (cachedPath != null) {
      return cachedPath;
    }

    // Try legacy structure only (modern projects MUST provide file_id)
    // Page numbers in modern projects are file-scoped, so we can't auto-detect
    try {
      String legacyPath = buildLegacyPageFolderPath(projectId, pageNumber);

      if (fileSystemHandler.exists(legacyPath)) {
        pathCache.put(cacheKey, legacyPath);
        logger.info(String.format(
            "Using legacy page path for project %s, page %d: %s",
            projectId, pageNumber, legacyPath));
        return legacyPath;
      }

      // Not found in legacy structure
      throw new PageNotFoundException(projectId, pageNumber,
          "Page not found in legacy structure. Modern projects require file_id parameter.");

    } catch (IOException e) {
      throw new PageNotFoundException(projectId, pageNumber, e);
    }
  }

  /**
   * Convenience overload for backward compatibility.
   * Delegates to the main method with fileId = null.
   */
  public String resolvePageFolderPath(String projectId, int pageNumber)
      throws PageNotFoundException {
    return resolvePageFolderPath(projectId, pageNumber, null);
  }

  /**
   * Detects the project structure version.
   *
   * @param projectId The unique identifier of the project
   * @return The detected project structure version
   */
  public ProjectStructureVersion detectProjectVersion(String projectId) throws IOException {
    boolean hasLegacyPages = fileSystemHandler.exists(
        String.format("projects/%s/pages/", projectId));
    boolean hasFiles = fileSystemHandler.exists(
        String.format("projects/%s/files/", projectId));

    if (hasFiles && !hasLegacyPages) {
      return ProjectStructureVersion.MODERN;
    } else if (hasFiles && hasLegacyPages) {
      return ProjectStructureVersion.TRANSITIONAL;
    } else if (hasLegacyPages) {
      return ProjectStructureVersion.LEGACY;
    } else {
      return ProjectStructureVersion.UNKNOWN;
    }
  }

  /**
   * Clears cached paths for a specific project.
   * Call this after migration to ensure fresh path resolution.
   */
  public void clearCacheForProject(String projectId) {
    // Remove only this project's entries (cache keys are "projectId:pageNumber")
    pathCache.asMap().keySet().removeIf(key -> key.startsWith(projectId + ":"));
    logger.info("Cleared path cache for project: " + projectId);
  }

  // Private helper methods

  private String buildLegacyPageFolderPath(String projectId, int pageNumber) {
    // NOTE: assumed legacy layout for illustration; align with the real bucket structure.
    return String.format("projects/%s/pages/%03d", projectId, pageNumber);
  }

  private String getCacheKey(String projectId, int pageNumber) {
    return projectId + ":" + pageNumber;
  }

  public enum ProjectStructureVersion {
    UNKNOWN,
    LEGACY,        // Only has pages/
    TRANSITIONAL,  // Has both pages/ and files/
    MODERN         // Only has files/
  }

  public static class PageNotFoundException extends Exception {
    public PageNotFoundException(String projectId, int pageNumber) {
      super(String.format("Page %d not found in project %s", pageNumber, projectId));
    }

    public PageNotFoundException(String projectId, int pageNumber, String detail) {
      super(String.format("Page %d not found in project %s: %s",
          pageNumber, projectId, detail));
    }

    public PageNotFoundException(String projectId, int pageNumber, Throwable cause) {
      super(String.format("Page %d not found in project %s", pageNumber, projectId), cause);
    }
  }
}

2. InputFileMetadataService

Purpose: Generate, retrieve, and manage file metadata

Location: src/main/java/org/codetricks/construction/code/assistant/service/InputFileMetadataService.java

package org.codetricks.construction.code.assistant.service;

import com.google.protobuf.util.JsonFormat;
import org.codetricks.construction.code.assistant.FileSystemHandler;

import java.io.IOException;
import java.security.MessageDigest;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

/**
 * Service for managing Project Input File metadata and the hierarchical file structure.
 *
 * <p>Handles metadata generation, persistence, and retrieval for user-uploaded input files
 * (PDFs, images, etc.) that get processed into individual pages.
 *
 * <p><b>GCS Path Structure:</b>
 * <ul>
 *   <li>Input files: {@code projects/{projectId}/inputs/filename.pdf}</li>
 *   <li>File metadata: {@code projects/{projectId}/files/{file_id}/metadata.json}</li>
 *   <li>Extracted pages: {@code projects/{projectId}/files/{file_id}/pages/{page_num}/}</li>
 *   <li>Legacy pages: {@code projects/{projectId}/pages/{page_num}/} (backward compatibility)</li>
 *   <li>File index: {@code projects/{projectId}/files/index.json} (file ID counter + mappings)</li>
 * </ul>
 *
 * <p><b>Responsibilities:</b>
 * <ul>
 *   <li>Generate rich metadata (InputFileMetadata proto) for uploaded files</li>
 *   <li>Assign auto-incrementing file IDs via {@code files/index.json}</li>
 *   <li>Classify document types (plans, specifications, reports, etc.)</li>
 *   <li>Persist and retrieve metadata from GCS</li>
 *   <li>Track processing status and page associations</li>
 * </ul>
 *
 * <p>Thread-safe and idempotent.
 *
 * <p>Instantiation: Create via constructor, pass FileSystemHandler and DocumentClassificationService
 */
public class InputFileMetadataService {

  private static final Logger logger =
      Logger.getLogger(InputFileMetadataService.class.getName());

  private final FileSystemHandler fileSystemHandler;
  private final DocumentClassificationService classificationService;

  public InputFileMetadataService(
      FileSystemHandler fileSystemHandler,
      DocumentClassificationService classificationService) {
    this.fileSystemHandler = fileSystemHandler;
    this.classificationService = classificationService;
  }

  /**
   * Generates comprehensive metadata for an input file.
   *
   * @param projectId The project ID
   * @param inputFilePath Path to the input file (e.g., "inputs/plans.pdf")
   * @param forceRegenerate Whether to overwrite existing metadata
   * @return Generated metadata
   */
  public InputFileMetadata generateMetadata(
      String projectId,
      String inputFilePath,
      boolean forceRegenerate) throws IOException {

    String fullPath = String.format("projects/%s/%s", projectId, inputFilePath);

    // Check if metadata already exists
    String fileId = extractOrGenerateFileId(projectId, inputFilePath);
    String metadataPath = getMetadataPath(projectId, fileId);

    if (!forceRegenerate && fileSystemHandler.exists(metadataPath)) {
      logger.info("Metadata already exists for file: " + inputFilePath);
      return loadMetadata(projectId, fileId);
    }

    logger.info("Generating metadata for file: " + inputFilePath);

    // Build metadata
    InputFileMetadata.Builder builder = InputFileMetadata.newBuilder()
        .setFileId(fileId)
        .setFileName(extractFileName(inputFilePath))
        .setFilePath(inputFilePath)
        .setMimeType(detectMimeType(fullPath))
        .setFileSizeBytes(fileSystemHandler.getFileSize(fullPath))
        .setUploadDate(com.google.protobuf.Timestamp.newBuilder()
            .setSeconds(Instant.now().getEpochSecond())
            .build())
        .setProcessingStatus(ProcessingStatus.PROCESSING_STATUS_UPLOADED);

    // For PDF files, extract page count
    if (fullPath.endsWith(".pdf")) {
      int pageCount = extractPageCount(fullPath);
      builder.setPageCount(pageCount);
    }

    // Classify document type (heuristic-based for now, AI later)
    DocumentType docType = classificationService.classifyDocument(
        builder.getFileName(), fullPath);
    builder.setDocumentType(docType);

    // Generate checksum
    String checksum = generateChecksum(fullPath);
    builder.setChecksumMd5(checksum);

    InputFileMetadata metadata = builder.build();

    // Save metadata to disk
    saveMetadata(projectId, fileId, metadata);

    logger.info("Generated metadata for file: " + fileId);
    return metadata;
  }

  /**
   * Loads existing metadata from disk.
   */
  public InputFileMetadata loadMetadata(String projectId, String fileId)
      throws IOException {
    String metadataPath = getMetadataPath(projectId, fileId);
    String metadataJson = fileSystemHandler.readFileAsString(metadataPath);

    InputFileMetadata.Builder builder = InputFileMetadata.newBuilder();
    JsonFormat.parser().merge(metadataJson, builder);
    return builder.build();
  }

  /**
   * Updates metadata with processing results.
   */
  public InputFileMetadata updateProcessingStatus(
      String projectId,
      String fileId,
      ProcessingStatus status,
      List<String> extractedPages) throws IOException {

    InputFileMetadata existing = loadMetadata(projectId, fileId);

    InputFileMetadata.Builder builder = existing.toBuilder()
        .setProcessingStatus(status)
        .setProcessedDate(com.google.protobuf.Timestamp.newBuilder()
            .setSeconds(Instant.now().getEpochSecond())
            .build())
        .clearExtractedPages()
        .addAllExtractedPages(extractedPages);

    InputFileMetadata updated = builder.build();
    saveMetadata(projectId, fileId, updated);

    return updated;
  }

  /**
   * Lists all file metadata in a project.
   */
  public List<InputFileMetadata> listAllMetadata(String projectId)
      throws IOException {
    String filesBasePath = String.format("projects/%s/files/", projectId);

    if (!fileSystemHandler.exists(filesBasePath)) {
      return new ArrayList<>();
    }

    List<String> fileIds = fileSystemHandler.listDirectories(filesBasePath);
    List<InputFileMetadata> metadataList = new ArrayList<>();

    for (String fileId : fileIds) {
      try {
        InputFileMetadata metadata = loadMetadata(projectId, fileId);
        metadataList.add(metadata);
      } catch (IOException e) {
        logger.warning("Failed to load metadata for file: " + fileId);
      }
    }

    return metadataList;
  }

  // Private helper methods

  /**
   * Generates a new file ID using auto-increment counter.
   * Maintains counter in projects/{projectId}/files/index.json
   */
  private String extractOrGenerateFileId(String projectId, String inputFilePath)
      throws IOException {
    String indexPath = String.format("projects/%s/files/index.json", projectId);

// Load index or create new one
FileIndex index;
if (fileSystemHandler.exists(indexPath)) {
String indexJson = fileSystemHandler.readFileAsString(indexPath);
index = new Gson().fromJson(indexJson, FileIndex.class);
} else {
index = new FileIndex();
index.nextFileId = 1;
index.files = new ArrayList<>();
}

// Generate new file ID
String fileId = String.valueOf(index.nextFileId);
index.nextFileId++;

// Add to index
index.files.add(new FileIndexEntry(fileId, extractFileName(inputFilePath)));

// Save index
String updatedJson = new Gson().toJson(index);
fileSystemHandler.writeFile(indexPath, updatedJson);

return fileId;
}

// Helper classes for file index
private static class FileIndex {
int nextFileId;
List<FileIndexEntry> files;
}

private static class FileIndexEntry {
String fileId;
String fileName;

FileIndexEntry(String fileId, String fileName) {
this.fileId = fileId;
this.fileName = fileName;
}
}

private String getMetadataPath(String projectId, String fileId) {
return String.format("projects/%s/files/%s/metadata.json", projectId, fileId);
}

private String extractFileName(String filePath) {
int lastSlash = filePath.lastIndexOf('/');
return lastSlash >= 0 ? filePath.substring(lastSlash + 1) : filePath;
}

private String detectMimeType(String filePath) {
if (filePath.toLowerCase().endsWith(".pdf")) {
return "application/pdf";
}
return "application/octet-stream";
}

private int extractPageCount(String pdfPath) throws IOException {
// Use Apache PDFBox to get page count
try (org.apache.pdfbox.pdmodel.PDDocument document =
org.apache.pdfbox.pdmodel.PDDocument.load(
fileSystemHandler.readFileAsBytes(pdfPath))) {
return document.getNumberOfPages();
}
}

private String generateChecksum(String filePath) throws IOException {
try {
MessageDigest md = MessageDigest.getInstance("MD5");
byte[] fileBytes = fileSystemHandler.readFileAsBytes(filePath);
byte[] hashBytes = md.digest(fileBytes);

StringBuilder sb = new StringBuilder();
for (byte b : hashBytes) {
sb.append(String.format("%02x", b));
}
return sb.toString();
} catch (Exception e) {
logger.warning("Failed to generate checksum: " + e.getMessage());
return "";
}
}

private void saveMetadata(
String projectId,
String fileId,
InputFileMetadata metadata) throws IOException {
String metadataPath = getMetadataPath(projectId, fileId);
String metadataJson = JsonFormat.printer()
.preservingProtoFieldNames()
.print(metadata);
fileSystemHandler.writeFile(metadataPath, metadataJson);
}
}
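
For reference, the files/index.json document maintained by extractOrGenerateFileId is a small Gson-serialized structure; a hypothetical example after two uploads (Gson serializes the FileIndex field names verbatim; the file names shown are illustrative):

```json
{
  "nextFileId": 3,
  "files": [
    { "fileId": "1", "fileName": "plans.pdf" },
    { "fileId": "2", "fileName": "electrical.pdf" }
  ]
}
```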

3. Migration Readiness Assessment​

Purpose: Determine if a project is safe to migrate and provide actionable recommendations

Readiness Status Values​

| Status | Condition | Can Migrate? | Description |
| --- | --- | --- | --- |
| READY | Legacy project, has input files and pages, no issues | ✅ Yes | Ideal state for migration |
| READY_WITH_WARNINGS | Legacy project, has pages, but minor issues (e.g., no input files) | ✅ Yes | Can proceed but review warnings |
| NOT_READY | Legacy project but critical blockers (e.g., no pages) | ❌ No | Fix issues before migrating |
| ALREADY_MIGRATED | Modern structure detected | N/A | No action needed |
| MIGRATION_IN_PROGRESS | Transitional state (both structures exist) | ⚠️ Caution | Likely interrupted migration |

Assessment Logic​

// Pseudo-code for readiness assessment
if (currentVersion == MODERN) {
return "ALREADY_MIGRATED";
} else if (currentVersion == TRANSITIONAL) {
return "MIGRATION_IN_PROGRESS"; // Investigate before retrying
} else if (legacyPageCount == 0) {
return "NOT_READY"; // Nothing to migrate
} else if (inputFileCount == 0) {
return "READY_WITH_WARNINGS"; // Will create default file entry
} else {
return "READY"; // Green light!
}
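
The branching above can be sketched as a pure function. A minimal Java sketch, with hypothetical class and enum names (the real service uses ProjectPathResolver.ProjectStructureVersion):

```java
// Readiness assessment as a pure function, mirroring the pseudo-code above.
// Class and enum names are hypothetical stand-ins for the real types.
public class ReadinessCheck {
    enum Version { LEGACY, TRANSITIONAL, MODERN }

    static String assess(Version version, int legacyPageCount, int inputFileCount) {
        if (version == Version.MODERN) return "ALREADY_MIGRATED";
        if (version == Version.TRANSITIONAL) return "MIGRATION_IN_PROGRESS";
        if (legacyPageCount == 0) return "NOT_READY";           // nothing to migrate
        if (inputFileCount == 0) return "READY_WITH_WARNINGS";  // default file entry will be created
        return "READY";
    }

    public static void main(String[] args) {
        System.out.println(assess(Version.LEGACY, 42, 3)); // READY
    }
}
```

Keeping the decision free of I/O makes the status matrix above directly unit-testable.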

Common Issues and Resolutions​

| Issue | Severity | Resolution |
| --- | --- | --- |
| No input files in inputs/ | Warning | Create default file entry with ID unknown-source |
| No pages in pages/ | Blocker | Cannot migrate empty project |
| Page numbering gaps | Warning | Proceed anyway, gaps will be preserved |
| Both structures exist | Warning | Likely interrupted migration, investigate before retrying |
| Insufficient storage | Blocker | Free up space or increase quota |

4. FileStructureMigrationService​

Purpose: Migrate legacy projects to new structure

Location: src/main/java/org/codetricks/construction/code/assistant/service/FileStructureMigrationService.java

package org.codetricks.construction.code.assistant.service;

import java.io.IOException;
import java.util.*;
import java.util.logging.Logger;

/**
* Service for migrating legacy projects from flat pages/ structure to
* hierarchical files/{file_id}/pages/ structure.
*
* <p>Migration Strategy:
* 1. Analyze inputs/ folder to identify source files
* 2. Generate file metadata for each input file
* 3. Associate existing pages with source files (best effort heuristic)
* 4. Copy pages to new files/{file_id}/pages/ structure
* 5. Keep legacy pages/ intact for rollback
* 6. Update plan-metadata.json with new paths
*
* <p>Thread-safe and idempotent.
*
* <p>Instantiation: Create via constructor, pass FileSystemHandler,
* InputFileMetadataService, and ProjectPathResolver
*/
public class FileStructureMigrationService {

private static final Logger logger = Logger.getLogger(
FileStructureMigrationService.class.getName());

private final FileSystemHandler fileSystemHandler;
private final InputFileMetadataService metadataService;
private final ProjectPathResolver pathResolver;

public FileStructureMigrationService(
FileSystemHandler fileSystemHandler,
InputFileMetadataService metadataService,
ProjectPathResolver pathResolver) {
this.fileSystemHandler = fileSystemHandler;
this.metadataService = metadataService;
this.pathResolver = pathResolver;
}

/**
* Analyzes a project to determine migration readiness.
* Provides comprehensive assessment including blockers, estimates, and recommendations.
*/
public MigrationAnalysis analyzeProject(String projectId) throws IOException {
logger.info("Analyzing project for migration: " + projectId);

MigrationAnalysis analysis = new MigrationAnalysis();
analysis.projectId = projectId;
analysis.currentVersion = pathResolver.detectProjectVersion(projectId);
analysis.issues = new ArrayList<>();

// Count input files
String inputsPath = String.format("projects/%s/inputs/", projectId);
if (fileSystemHandler.exists(inputsPath)) {
analysis.inputFileCount = fileSystemHandler.listFiles(inputsPath).size();
}

// Count legacy pages
String pagesPath = String.format("projects/%s/pages/", projectId);
if (fileSystemHandler.exists(pagesPath)) {
analysis.legacyPageCount = fileSystemHandler.listDirectories(pagesPath).size();
}

// Determine if migration is needed
analysis.needsMigration = (analysis.currentVersion ==
ProjectPathResolver.ProjectStructureVersion.LEGACY);

// Estimate duration (rough estimate: 1 second per page + 5 seconds per file)
analysis.estimatedDurationSeconds =
(analysis.legacyPageCount * 1) + (analysis.inputFileCount * 5);

// Check for blockers and warnings
if (analysis.inputFileCount == 0) {
analysis.issues.add("No input files found in inputs/ folder - will create default file entry");
}

if (analysis.legacyPageCount == 0) {
analysis.issues.add("No pages found in pages/ folder - nothing to migrate");
}

// Assess readiness status
if (analysis.currentVersion == ProjectPathResolver.ProjectStructureVersion.MODERN) {
analysis.readinessStatus = "ALREADY_MIGRATED";
analysis.readinessMessage = "Project has already been migrated to the new file structure.";
} else if (analysis.currentVersion == ProjectPathResolver.ProjectStructureVersion.TRANSITIONAL) {
analysis.readinessStatus = "MIGRATION_IN_PROGRESS";
analysis.readinessMessage = "Project migration is in progress or partially complete. " +
"Both legacy and modern structures exist.";
} else if (!analysis.issues.isEmpty() && analysis.legacyPageCount == 0) {
analysis.readinessStatus = "NOT_READY";
analysis.readinessMessage = "Project cannot be migrated: no pages found.";
} else if (!analysis.issues.isEmpty()) {
analysis.readinessStatus = "READY_WITH_WARNINGS";
analysis.readinessMessage = String.format(
"Project can be migrated with warnings. Found %d input files and %d pages. " +
"Estimated time: %d seconds. Issues: %s",
analysis.inputFileCount, analysis.legacyPageCount,
analysis.estimatedDurationSeconds, String.join("; ", analysis.issues));
} else {
analysis.readinessStatus = "READY";
analysis.readinessMessage = String.format(
"Project is ready to migrate. Found %d input files and %d pages. " +
"Estimated time: %d seconds. No blockers detected.",
analysis.inputFileCount, analysis.legacyPageCount,
analysis.estimatedDurationSeconds);
}

logger.info("Analysis complete: " + analysis);
return analysis;
}

/**
* Migrates a project to new file structure.
*
* @param projectId The project to migrate
* @param preserveLegacy Whether to keep pages/ folder after migration
* @param dryRun If true, only preview changes without applying
* @return Migration result
*/
public MigrationResult migrateProject(
String projectId,
boolean preserveLegacy,
boolean dryRun) throws IOException {

logger.info(String.format(
"Starting migration for project %s (dryRun=%s, preserve=%s)",
projectId, dryRun, preserveLegacy));

MigrationResult result = new MigrationResult();
result.projectId = projectId;
result.startTime = System.currentTimeMillis();

try {
// Step 1: Analyze input files
List<String> inputFiles = discoverInputFiles(projectId);
logger.info("Discovered " + inputFiles.size() + " input files");

if (inputFiles.isEmpty()) {
// No input files - create a default file for all pages
inputFiles.add(createDefaultFileEntry(projectId));
}

// Step 2: Generate metadata for each input file
Map<String, InputFileMetadata> fileMetadataMap = new HashMap<>();
for (String inputFile : inputFiles) {
InputFileMetadata metadata = metadataService.generateMetadata(
projectId, inputFile, false /* don't force regenerate */);
fileMetadataMap.put(metadata.getFileId(), metadata);
result.migratedFiles.add(metadata);
}

// Step 3: Associate pages with files (heuristic-based)
Map<String, List<Integer>> fileToPages = associatePagesWithFiles(
projectId, fileMetadataMap);

// Step 4: Migrate pages to new structure
for (Map.Entry<String, List<Integer>> entry : fileToPages.entrySet()) {
String fileId = entry.getKey();
List<Integer> pageNumbers = entry.getValue();

for (int pageNumber : pageNumbers) {
if (!dryRun) {
migratePageToNewStructure(projectId, fileId, pageNumber);
}
result.totalPagesMigrated++;
}

result.pagesPerFile.put(fileId, pageNumbers.size());
}

// Step 5: Update metadata with extracted pages
if (!dryRun) {
for (Map.Entry<String, List<Integer>> entry : fileToPages.entrySet()) {
String fileId = entry.getKey();
List<String> pageIds = entry.getValue().stream()
.map(String::valueOf)
.toList();
metadataService.updateProcessingStatus(
projectId, fileId, ProcessingStatus.PROCESSING_STATUS_COMPLETED, pageIds);
}
}

// Step 6: Optionally remove legacy pages/ folder
if (!preserveLegacy && !dryRun) {
String legacyPagesPath = String.format("projects/%s/pages/", projectId);
fileSystemHandler.deleteDirectory(legacyPagesPath);
logger.info("Removed legacy pages/ folder");
}

// Step 7: Clear path cache to force re-resolution
if (!dryRun) {
pathResolver.clearCacheForProject(projectId);
}

result.success = true;
logger.info("Migration completed successfully");

} catch (Exception e) {
result.success = false;
result.errorMessage = e.getMessage();
logger.log(java.util.logging.Level.SEVERE, "Migration failed", e);
}

result.endTime = System.currentTimeMillis();
return result;
}

// Private helper methods

private List<String> discoverInputFiles(String projectId) throws IOException {
String inputsPath = String.format("projects/%s/inputs/", projectId);

if (!fileSystemHandler.exists(inputsPath)) {
return new ArrayList<>();
}

return fileSystemHandler.listFiles(inputsPath).stream()
.map(filename -> "inputs/" + filename)
.toList();
}

private String createDefaultFileEntry(String projectId) {
// For projects with no input files, create a placeholder
return "inputs/unknown-source.pdf";
}

private Map<String, List<Integer>> associatePagesWithFiles(
String projectId,
Map<String, InputFileMetadata> fileMetadataMap) throws IOException {

// Simple heuristic: Distribute pages evenly across files based on page count
Map<String, List<Integer>> fileToPages = new HashMap<>();

// Get list of all legacy pages
String legacyPagesPath = String.format("projects/%s/pages/", projectId);
List<Integer> allPages = fileSystemHandler.listDirectories(legacyPagesPath).stream()
.map(Integer::parseInt)
.sorted()
.toList();

if (allPages.isEmpty()) {
return fileToPages;
}

// If only one file, assign all pages to it
if (fileMetadataMap.size() == 1) {
String fileId = fileMetadataMap.keySet().iterator().next();
fileToPages.put(fileId, new ArrayList<>(allPages));
return fileToPages;
}

// Otherwise, distribute based on page count in each file
List<InputFileMetadata> sortedFiles = fileMetadataMap.values().stream()
.sorted(Comparator.comparingInt(InputFileMetadata::getPageCount))
.toList();

int currentPageIndex = 0;
for (InputFileMetadata metadata : sortedFiles) {
int pageCount = metadata.getPageCount();
List<Integer> assignedPages = new ArrayList<>();

for (int i = 0; i < pageCount && currentPageIndex < allPages.size(); i++) {
assignedPages.add(allPages.get(currentPageIndex++));
}

fileToPages.put(metadata.getFileId(), assignedPages);
}

// Assign remaining pages to last file (edge case)
if (currentPageIndex < allPages.size()) {
String lastFileId = sortedFiles.get(sortedFiles.size() - 1).getFileId();
List<Integer> lastFilePages = fileToPages.get(lastFileId);
while (currentPageIndex < allPages.size()) {
lastFilePages.add(allPages.get(currentPageIndex++));
}
}

return fileToPages;
}

private void migratePageToNewStructure(
String projectId,
String fileId,
int pageNumber) throws IOException {

String legacyPath = String.format("projects/%s/pages/%03d/", projectId, pageNumber);
String newPath = String.format("projects/%s/files/%s/pages/%03d/",
projectId, fileId, pageNumber);

// Copy entire page folder to new location
fileSystemHandler.copyDirectory(legacyPath, newPath);

logger.info(String.format("Migrated page %d to file %s", pageNumber, fileId));
}

// Data classes

public static class MigrationAnalysis {
public String projectId;
public ProjectPathResolver.ProjectStructureVersion currentVersion;
public boolean needsMigration;
public int inputFileCount;
public int legacyPageCount;
public int estimatedDurationSeconds;
public List<String> issues;
public String readinessStatus; // READY, READY_WITH_WARNINGS, NOT_READY, ALREADY_MIGRATED, MIGRATION_IN_PROGRESS
public String readinessMessage;

@Override
public String toString() {
return String.format(
"MigrationAnalysis{project=%s, version=%s, readiness=%s, " +
"input_files=%d, pages=%d, est_duration=%ds, issues=%d}",
projectId, currentVersion, readinessStatus,
inputFileCount, legacyPageCount, estimatedDurationSeconds,
issues != null ? issues.size() : 0);
}
}

public static class MigrationResult {
public String projectId;
public boolean success;
public List<InputFileMetadata> migratedFiles = new ArrayList<>();
public Map<String, Integer> pagesPerFile = new HashMap<>();
public int totalPagesMigrated;
public String errorMessage;
public long startTime;
public long endTime;

public long getDurationSeconds() {
return (endTime - startTime) / 1000;
}
}
}
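
The page-distribution heuristic inside associatePagesWithFiles can be isolated from I/O for testing. A simplified sketch, assuming the files are already in assignment order (the real service first sorts by page count); class and method names are hypothetical:

```java
import java.util.*;

// Sequential page distribution: each file claims pages up to its declared
// page count; any leftover pages fall to the last file (the edge case noted
// in the service code). Names are hypothetical, not the production API.
public class PageDistribution {
    static Map<String, List<Integer>> distribute(
            LinkedHashMap<String, Integer> pageCountByFileId, List<Integer> allPages) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int idx = 0;
        String lastFileId = null;
        for (Map.Entry<String, Integer> e : pageCountByFileId.entrySet()) {
            List<Integer> assigned = new ArrayList<>();
            // Claim up to this file's declared page count.
            for (int i = 0; i < e.getValue() && idx < allPages.size(); i++) {
                assigned.add(allPages.get(idx++));
            }
            result.put(e.getKey(), assigned);
            lastFileId = e.getKey();
        }
        // Leftover pages (under-reported counts) go to the last file.
        while (lastFileId != null && idx < allPages.size()) {
            result.get(lastFileId).add(allPages.get(idx++));
        }
        return result;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Integer> counts = new LinkedHashMap<>();
        counts.put("1", 2);
        counts.put("2", 1);
        System.out.println(distribute(counts, List.of(1, 2, 3, 4)));
        // {1=[1, 2], 2=[3, 4]}
    }
}
```

Note this is a best-effort association only; pages are never dropped, they just may land on the wrong file when the PDF page counts disagree with the legacy folder contents.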

5. DocumentClassificationService​

Purpose: Classify document type (heuristic-based initially)

Location: src/main/java/org/codetricks/construction/code/assistant/service/DocumentClassificationService.java

package org.codetricks.construction.code.assistant.service;

import java.util.regex.Pattern;

/**
* Service for classifying document types based on filename and content.
* Uses heuristic rules initially, can be enhanced with AI classification later.
*
* <p>Instantiation: Stateless utility, can be instantiated with default constructor
*/
public class DocumentClassificationService {

// Patterns for document type detection
private static final Pattern ARCHITECTURAL_PATTERN = Pattern.compile(
"(?i).*(architectural|arch|floor[\\s-]?plan|site[\\s-]?plan|elevation|section).*");

private static final Pattern ELECTRICAL_PATTERN = Pattern.compile(
"(?i).*(electrical|elec|power|lighting).*");

private static final Pattern MECHANICAL_PATTERN = Pattern.compile(
"(?i).*(mechanical|mech|hvac|plumbing|mep).*");

private static final Pattern STRUCTURAL_PATTERN = Pattern.compile(
"(?i).*(structural|struct|foundation|framing).*");

private static final Pattern PERMIT_PATTERN = Pattern.compile(
"(?i).*(permit|application|approval).*");

private static final Pattern INSPECTION_PATTERN = Pattern.compile(
"(?i).*(inspector|inspection|feedback|corrections).*");

/**
* Classifies a document based on filename and optional content analysis.
*
* @param filename The name of the file
* @param filePath Optional path to file for content analysis (future enhancement)
* @return The classified document type
*/
public DocumentType classifyDocument(String filename, String filePath) {
// Heuristic-based classification using filename patterns

if (ARCHITECTURAL_PATTERN.matcher(filename).matches()) {
return DocumentType.DOCUMENT_TYPE_ARCHITECTURAL_PLAN;
}

if (ELECTRICAL_PATTERN.matcher(filename).matches()) {
return DocumentType.DOCUMENT_TYPE_ELECTRICAL_PLAN;
}

if (MECHANICAL_PATTERN.matcher(filename).matches()) {
return DocumentType.DOCUMENT_TYPE_MECHANICAL_PLAN;
}

if (STRUCTURAL_PATTERN.matcher(filename).matches()) {
return DocumentType.DOCUMENT_TYPE_STRUCTURAL_PLAN;
}

if (PERMIT_PATTERN.matcher(filename).matches()) {
return DocumentType.DOCUMENT_TYPE_PERMIT_APPLICATION;
}

if (INSPECTION_PATTERN.matcher(filename).matches()) {
return DocumentType.DOCUMENT_TYPE_INSPECTOR_FEEDBACK;
}

// Default to unknown if no pattern matches
return DocumentType.DOCUMENT_TYPE_UNKNOWN;
}

// Future enhancement: AI-based classification using LLM
public DocumentType classifyDocumentWithAI(String filePath) {
// TODO: Implement LLM-based classification
// 1. Extract first page or sample of content
// 2. Call LLM with prompt: "Classify this construction document..."
// 3. Parse LLM response to DocumentType enum
// 4. Fall back to heuristic if LLM fails
throw new UnsupportedOperationException("AI classification not yet implemented");
}
}
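
To see the precedence of the filename heuristics in action, here is a self-contained demo using two of the patterns copied from the service above; the class name and string return values are hypothetical stand-ins for the DocumentType enum:

```java
import java.util.regex.Pattern;

// Demonstrates the filename-based heuristics; patterns copied from
// DocumentClassificationService, everything else is a hypothetical sketch.
public class ClassificationDemo {
    static final Pattern ARCHITECTURAL = Pattern.compile(
            "(?i).*(architectural|arch|floor[\\s-]?plan|site[\\s-]?plan|elevation|section).*");
    static final Pattern ELECTRICAL = Pattern.compile(
            "(?i).*(electrical|elec|power|lighting).*");

    static String classify(String filename) {
        // Order matters: architectural is checked first, as in the service.
        if (ARCHITECTURAL.matcher(filename).matches()) return "ARCHITECTURAL_PLAN";
        if (ELECTRICAL.matcher(filename).matches()) return "ELECTRICAL_PLAN";
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        System.out.println(classify("Floor-Plan_Rev3.pdf")); // ARCHITECTURAL_PLAN
        System.out.println(classify("lighting-layout.pdf")); // ELECTRICAL_PLAN
        System.out.println(classify("notes.txt"));           // UNKNOWN
    }
}
```

One known limitation of these heuristics: short tokens like arch match as substrings (a file named "research.pdf" would classify as architectural), which is part of the motivation for the future AI-based classifier.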

Frontend Implementation​

1. File Metadata List Component​

Purpose: Display list of files with rich metadata

Location: web-ng-m3/src/app/components/project/settings/file-metadata-list/file-metadata-list.component.ts

import { Component, Input, OnInit } from '@angular/core';
import { InputFileMetadata, DocumentType, ProcessingStatus } from '@generated/api_pb';
import { ArchitecturalPlanService } from '@app/services/architectural-plan.service';

@Component({
selector: 'app-file-metadata-list',
templateUrl: './file-metadata-list.component.html',
styleUrls: ['./file-metadata-list.component.scss']
})
export class FileMetadataListComponent implements OnInit {
@Input() projectId: string = '';

files: InputFileMetadata[] = [];
loading: boolean = true;
error: string | null = null;

// Enum references for template
DocumentType = DocumentType;
ProcessingStatus = ProcessingStatus;

constructor(private planService: ArchitecturalPlanService) {}

ngOnInit(): void {
this.loadFileMetadata();
}

private loadFileMetadata(): void {
this.loading = true;
this.error = null;

this.planService.listInputFileMetadata(this.projectId)
.subscribe({
next: (response) => {
this.files = response.files;
this.loading = false;
},
error: (err) => {
this.error = 'Failed to load file metadata';
this.loading = false;
console.error('Error loading file metadata:', err);
}
});
}

getDocumentTypeLabel(type: DocumentType): string {
switch (type) {
case DocumentType.DOCUMENT_TYPE_ARCHITECTURAL_PLAN:
return 'Architectural Plan';
case DocumentType.DOCUMENT_TYPE_ELECTRICAL_PLAN:
return 'Electrical Plan';
case DocumentType.DOCUMENT_TYPE_MECHANICAL_PLAN:
return 'Mechanical Plan';
case DocumentType.DOCUMENT_TYPE_STRUCTURAL_PLAN:
return 'Structural Plan';
case DocumentType.DOCUMENT_TYPE_INSPECTOR_FEEDBACK:
return 'Inspector Feedback';
case DocumentType.DOCUMENT_TYPE_PERMIT_APPLICATION:
return 'Permit Application';
case DocumentType.DOCUMENT_TYPE_SITE_PLAN:
return 'Site Plan';
case DocumentType.DOCUMENT_TYPE_ELEVATION_DRAWING:
return 'Elevation Drawing';
case DocumentType.DOCUMENT_TYPE_SECTION_DRAWING:
return 'Section Drawing';
default:
return 'Unknown';
}
}

getProcessingStatusLabel(status: ProcessingStatus): string {
switch (status) {
case ProcessingStatus.PROCESSING_STATUS_UPLOADED:
return 'Uploaded';
case ProcessingStatus.PROCESSING_STATUS_PROCESSING:
return 'Processing';
case ProcessingStatus.PROCESSING_STATUS_COMPLETED:
return 'Completed';
case ProcessingStatus.PROCESSING_STATUS_FAILED:
return 'Failed';
default:
return 'Unknown';
}
}

getProcessingStatusColor(status: ProcessingStatus): string {
switch (status) {
case ProcessingStatus.PROCESSING_STATUS_COMPLETED:
return 'success';
case ProcessingStatus.PROCESSING_STATUS_PROCESSING:
return 'primary';
case ProcessingStatus.PROCESSING_STATUS_FAILED:
return 'warn';
default:
return 'accent';
}
}

formatFileSize(bytes: number): string {
if (bytes === 0) return '0 B';
const k = 1024;
const sizes = ['B', 'KB', 'MB', 'GB'];
const i = Math.floor(Math.log(bytes) / Math.log(k));
return parseFloat((bytes / Math.pow(k, i)).toFixed(2)) + ' ' + sizes[i];
}

formatDate(isoDate: string): string {
if (!isoDate) return 'N/A';
return new Date(isoDate).toLocaleDateString();
}
}

Template: file-metadata-list.component.html

<div class="file-metadata-list">
<h3>Project Files</h3>

<div *ngIf="loading" class="loading-spinner">
<mat-spinner diameter="40"></mat-spinner>
<p>Loading file metadata...</p>
</div>

<div *ngIf="error" class="error-message">
<mat-icon>error</mat-icon>
<p>{{ error }}</p>
</div>

<div *ngIf="!loading && !error && files.length === 0" class="empty-state">
<mat-icon>description</mat-icon>
<p>No files found in this project.</p>
</div>

<mat-card *ngFor="let file of files" class="file-card">
<mat-card-header>
<mat-icon mat-card-avatar>description</mat-icon>
<mat-card-title>{{ file.fileName }}</mat-card-title>
<mat-card-subtitle>{{ getDocumentTypeLabel(file.documentType) }}</mat-card-subtitle>
</mat-card-header>

<mat-card-content>
<div class="file-details">
<div class="detail-row">
<span class="label">File ID:</span>
<span class="value">{{ file.fileId }}</span>
</div>

<div class="detail-row">
<span class="label">Size:</span>
<span class="value">{{ formatFileSize(file.fileSizeBytes) }}</span>
</div>

<div class="detail-row">
<span class="label">Pages:</span>
<span class="value">{{ file.pageCount }}</span>
</div>

<div class="detail-row">
<span class="label">Upload Date:</span>
<span class="value">{{ formatDate(file.uploadDate) }}</span>
</div>

<div class="detail-row">
<span class="label">Status:</span>
<mat-chip [color]="getProcessingStatusColor(file.processingStatus)" selected>
{{ getProcessingStatusLabel(file.processingStatus) }}
</mat-chip>
</div>

<div *ngIf="file.contentSummary" class="detail-row">
<span class="label">Summary:</span>
<p class="summary">{{ file.contentSummary }}</p>
</div>
</div>
</mat-card-content>

<mat-card-actions align="end">
<button mat-button color="primary">View Pages</button>
<button mat-button>Reprocess</button>
</mat-card-actions>
</mat-card>
</div>

2. Hierarchical Page Navigation Component​

Purpose: Display pages organized hierarchically by source file in table of contents

Location: web-ng-m3/src/app/components/project/pages/page-toc-hierarchical/page-toc-hierarchical.component.ts

UI Pattern: Mirrors the compliance tab's collapsible section hierarchy

import { Component, Input, Output, EventEmitter, OnInit } from '@angular/core';
import { InputFileMetadata, DocumentType } from '@generated/api_pb';
import { ArchitecturalPlanService } from '@app/services/architectural-plan.service';
import { trigger, state, style, transition, animate } from '@angular/animations';

export interface PageTreeNode {
type: 'file' | 'page';
fileId?: string;
fileName?: string;
documentType?: DocumentType;
pageCount?: number;
pageNumber?: number;
pageTitle?: string;
depth: number;
children?: PageTreeNode[];
}

@Component({
selector: 'app-page-toc-hierarchical',
templateUrl: './page-toc-hierarchical.component.html',
styleUrls: ['./page-toc-hierarchical.component.scss'],
animations: [
trigger('rowAnimation', [
transition(':enter', [
style({ height: '0px', opacity: 0, transform: 'translateY(-10px)', overflow: 'hidden' }),
animate('150ms ease-out', style({ height: '*', opacity: 1, transform: 'translateY(0)' }))
]),
transition(':leave', [
animate('150ms ease-in', style({ height: '0px', opacity: 0, transform: 'translateY(-10px)', overflow: 'hidden' }))
])
])
]
})
export class PageTocHierarchicalComponent implements OnInit {
@Input() projectId: string = '';
@Input() selectedPageNumber: number | null = null;
@Output() pageSelected = new EventEmitter<number>();

treeNodes: PageTreeNode[] = [];
expandedFileIds = new Set<string>();
loading: boolean = true;

constructor(private planService: ArchitecturalPlanService) {}

ngOnInit(): void {
this.loadHierarchy();
}

private async loadHierarchy(): Promise<void> {
this.loading = true;

try {
// Load file metadata
const response = await this.planService.listInputFileMetadata(this.projectId).toPromise();
const files = response?.files || [];

// Load plan pages
const plan = await this.planService.getArchitecturalPlan(this.projectId).toPromise();
const pages = plan?.pages || [];

// Build tree structure
this.treeNodes = files.map(file => ({
type: 'file' as const,
fileId: file.fileId,
fileName: file.fileName,
documentType: file.documentType,
pageCount: file.pageCount,
depth: 0,
children: pages
.filter(page => this.pagesBelongsToFile(page, file))
.map(page => ({
type: 'page' as const,
pageNumber: page.pageNumber,
pageTitle: page.title,
depth: 1
}))
}));

// Expand all by default
files.forEach(file => this.expandedFileIds.add(file.fileId));

} catch (error) {
console.error('Error loading hierarchy:', error);
} finally {
this.loading = false;
}
}

isExpanded(fileId: string): boolean {
return this.expandedFileIds.has(fileId);
}

toggleFile(fileId: string): void {
if (this.expandedFileIds.has(fileId)) {
this.expandedFileIds.delete(fileId);
} else {
this.expandedFileIds.add(fileId);
}
}

expandAll(): void {
this.treeNodes.forEach(node => {
if (node.fileId) {
this.expandedFileIds.add(node.fileId);
}
});
}

collapseAll(): void {
this.expandedFileIds.clear();
}

selectPage(pageNumber: number): void {
this.selectedPageNumber = pageNumber;
this.pageSelected.emit(pageNumber);
}

getNodePadding(depth: number): string {
return `${depth * 24}px`;
}

getDocumentTypeIcon(type: DocumentType): string {
switch (type) {
case DocumentType.DOCUMENT_TYPE_ARCHITECTURAL_PLAN:
return 'architecture';
case DocumentType.DOCUMENT_TYPE_ELECTRICAL_PLAN:
return 'electrical_services';
case DocumentType.DOCUMENT_TYPE_MECHANICAL_PLAN:
return 'hvac';
case DocumentType.DOCUMENT_TYPE_STRUCTURAL_PLAN:
return 'foundation';
default:
return 'description';
}
}

getDocumentTypeLabel(type: DocumentType): string {
// Referenced by the document-type chip in the template
switch (type) {
case DocumentType.DOCUMENT_TYPE_ARCHITECTURAL_PLAN:
return 'Architectural Plan';
case DocumentType.DOCUMENT_TYPE_ELECTRICAL_PLAN:
return 'Electrical Plan';
case DocumentType.DOCUMENT_TYPE_MECHANICAL_PLAN:
return 'Mechanical Plan';
case DocumentType.DOCUMENT_TYPE_STRUCTURAL_PLAN:
return 'Structural Plan';
default:
return 'Unknown';
}
}

private pagesBelongsToFile(page: any, file: InputFileMetadata): boolean {
// For now, check if page number is in extracted_pages
// This will be more sophisticated once migration is complete
return file.extractedPages?.includes(page.pageNumber.toString()) || false;
}
}

Template: page-toc-hierarchical.component.html

<div class="page-toc-hierarchical">
<div class="toc-header">
<h3>Table of Contents</h3>
<div class="toc-actions">
<button mat-icon-button (click)="expandAll()" title="Expand All">
<mat-icon>unfold_more</mat-icon>
</button>
<button mat-icon-button (click)="collapseAll()" title="Collapse All">
<mat-icon>unfold_less</mat-icon>
</button>
</div>
</div>

<div *ngIf="loading" class="loading-state">
<mat-spinner diameter="30"></mat-spinner>
</div>

<div *ngIf="!loading" class="toc-tree">
<ng-container *ngFor="let node of treeNodes">
<!-- File Node (Parent) -->
<div class="tree-node file-node"
[style.padding-left]="getNodePadding(node.depth)"
(click)="toggleFile(node.fileId!)"
[@rowAnimation]>
<mat-icon class="expand-icon">
{{ isExpanded(node.fileId!) ? 'expand_more' : 'chevron_right' }}
</mat-icon>
<mat-icon class="file-icon">{{ getDocumentTypeIcon(node.documentType!) }}</mat-icon>
<span class="file-name">{{ node.fileName }}</span>
<mat-chip class="document-type-chip" size="small">
{{ getDocumentTypeLabel(node.documentType!) }}
</mat-chip>
<span class="page-count">{{ node.pageCount }} pages</span>
</div>

<!-- Page Nodes (Children) - only shown when file is expanded -->
<ng-container *ngIf="isExpanded(node.fileId!)">
<div *ngFor="let child of node.children"
class="tree-node page-node"
[class.selected]="child.pageNumber === selectedPageNumber"
[style.padding-left]="getNodePadding(child.depth)"
(click)="selectPage(child.pageNumber!)"
[@rowAnimation]>
<span class="expander-placeholder"></span>
<mat-icon class="page-icon">article</mat-icon>
<span class="page-label">Page {{ child.pageNumber }}: {{ child.pageTitle }}</span>
</div>
</ng-container>
</ng-container>
</div>
</div>

Styling (similar to compliance tab):

.page-toc-hierarchical {
.tree-node {
display: flex;
align-items: center;
padding: 8px;
cursor: pointer;
transition: background-color 150ms ease;

&:hover {
background-color: rgba(0, 0, 0, 0.04);
}

&.selected {
background-color: rgba(63, 81, 181, 0.1);
border-left: 3px solid #3f51b5;
}
}

.file-node {
font-weight: 500;
border-bottom: 1px solid rgba(0, 0, 0, 0.12);
}

.page-node {
font-weight: 400;
}

.expand-icon {
margin-right: 8px;
}

.expander-placeholder {
display: inline-block;
width: 32px;
}

.file-icon, .page-icon {
margin-right: 8px;
color: rgba(0, 0, 0, 0.54);
}
}

Integration: This component replaces or enhances the existing flat page list in the TOC sidebar.

3. Legacy Project Upgrade Banner Component​

Purpose: Prompt users to upgrade legacy projects

Location: web-ng-m3/src/app/components/project/settings/legacy-upgrade-banner/legacy-upgrade-banner.component.ts

import { Component, Input, OnInit, Output, EventEmitter } from '@angular/core';
import { MatDialog } from '@angular/material/dialog';
import { ArchitecturalPlanService } from '@app/services/architectural-plan.service';
import { FileStructureMigrationDialogComponent } from './file-structure-migration-dialog.component';

@Component({
selector: 'app-legacy-upgrade-banner',
templateUrl: './legacy-upgrade-banner.component.html',
styleUrls: ['./legacy-upgrade-banner.component.scss']
})
export class LegacyUpgradeBannerComponent implements OnInit {
@Input() projectId: string = '';
@Output() upgraded = new EventEmitter<void>();

isLegacyProject: boolean = false;
showBanner: boolean = false;
checking: boolean = true;

constructor(
private planService: ArchitecturalPlanService,
private dialog: MatDialog
) {}

ngOnInit(): void {
this.checkIfLegacyProject();
}

private checkIfLegacyProject(): void {
this.checking = true;

this.planService.analyzeProjectMigration(this.projectId)
.subscribe({
next: (analysis) => {
this.isLegacyProject = analysis.needsMigration;
this.showBanner = this.isLegacyProject && !this.isDismissed();
this.checking = false;
},
error: (err) => {
console.error('Error checking project version:', err);
this.checking = false;
}
});
}

openUpgradeDialog(): void {
const dialogRef = this.dialog.open(FileStructureMigrationDialogComponent, {
width: '600px',
data: { projectId: this.projectId }
});

dialogRef.afterClosed().subscribe(result => {
if (result === 'upgraded') {
this.showBanner = false;
this.upgraded.emit();
}
});
}

dismissBanner(): void {
this.showBanner = false;
this.markAsDismissed();
}

private isDismissed(): boolean {
const key = `legacy-upgrade-dismissed-${this.projectId}`;
return localStorage.getItem(key) === 'true';
}

private markAsDismissed(): void {
const key = `legacy-upgrade-dismissed-${this.projectId}`;
localStorage.setItem(key, 'true');
}
}

Template: legacy-upgrade-banner.component.html

<mat-card *ngIf="showBanner" class="legacy-upgrade-banner" appearance="outlined">
<mat-card-content>
<div class="banner-content">
<mat-icon class="info-icon">info</mat-icon>
<div class="banner-text">
<h3>Upgrade Available</h3>
<p>
Upgrade your project to the new file structure for better organization,
rich file metadata, and improved search capabilities.
</p>
</div>
<div class="banner-actions">
<button mat-raised-button color="primary" (click)="openUpgradeDialog()">
Upgrade Project
</button>
<button mat-button (click)="dismissBanner()">Dismiss</button>
</div>
</div>
</mat-card-content>
</mat-card>

CLI Tools​

Bulk Upgrade Command​

Purpose: Admin tool for bulk upgrading legacy projects

Location: cli/codeproof.sh upgrade-file-structure

#!/bin/bash
# Bulk upgrade legacy projects to new file structure

set -e

# Configuration
GRPC_HOST="${GRPC_HOST:-localhost:8080}"
PROTO_PATH="src/main/proto"
GOOGLEAPIS_PATH="env/dependencies/googleapis"

# Parse arguments
DRY_RUN="false"
USER_ID=""
PROJECT_IDS=""
ALL="false"

while [[ $# -gt 0 ]]; do
case $1 in
--dry-run)
DRY_RUN="$2"
shift 2
;;
--user-id)
USER_ID="$2"
shift 2
;;
--project-ids)
PROJECT_IDS="$2"
shift 2
;;
--all)
ALL="true"
shift
;;
*)
echo "Unknown option: $1"
exit 1
;;
esac
done

# Validate inputs
if [ -z "$USER_ID" ]; then
echo "Error: --user-id is required"
exit 1
fi

echo "=========================================="
echo "File Structure Bulk Upgrade Tool"
echo "=========================================="
echo "User ID: $USER_ID"
echo "Dry Run: $DRY_RUN"
echo "All Projects: $ALL"
echo ""

# Function to migrate a single project
migrate_project() {
local project_id=$1

echo "Migrating project: $project_id"

RESPONSE=$(grpcurl -plaintext \
-import-path "${PROTO_PATH}" \
-import-path "${GOOGLEAPIS_PATH}" \
-proto "${PROTO_PATH}/api.proto" \
-d '{
"project_id": "'"${project_id}"'",
"preserve_legacy_structure": true,
"dry_run": '"${DRY_RUN}"',
"initiated_by": "'"${USER_ID}"'"
}' \
"${GRPC_HOST}" \
org.codetricks.construction.code.assistant.service.ArchitecturalPlanService/MigrateProjectFileStructure)

echo "$RESPONSE" | jq .

SUCCESS=$(echo "$RESPONSE" | jq -r '.success')
if [ "$SUCCESS" == "true" ]; then
echo "βœ… Successfully migrated: $project_id"
MIGRATE_OK="true"
else
ERROR=$(echo "$RESPONSE" | jq -r '.error_message')
echo "❌ Failed to migrate $project_id: $ERROR"
MIGRATE_OK="false"
fi

echo ""
}

# Get list of projects to migrate
if [ "$ALL" == "true" ]; then
echo "Fetching all projects for user..."
LIST_RESPONSE=$(grpcurl -plaintext \
-import-path "${PROTO_PATH}" \
-import-path "${GOOGLEAPIS_PATH}" \
-proto "${PROTO_PATH}/api.proto" \
-d '{"account_id": "'"${USER_ID}"'"}' \
"${GRPC_HOST}" \
org.codetricks.construction.code.assistant.service.ArchitecturalPlanService/ListArchitecturalPlanIds)

PROJECT_IDS=$(echo "$LIST_RESPONSE" | jq -r '.architectural_plan_ids[]')
fi

# Normalize the ID list into an array: --project-ids is comma-separated,
# while jq in --all mode emits one ID per line (a plain IFS=',' read would
# stop at the first newline)
PROJECT_IDS="${PROJECT_IDS//,/$'\n'}"
mapfile -t PROJECTS <<< "$PROJECT_IDS"

# Migrate each project
TOTAL=${#PROJECTS[@]}
SUCCESS_COUNT=0
FAIL_COUNT=0

echo "Found $TOTAL projects to migrate"
echo ""

for project_id in "${PROJECTS[@]}"; do
MIGRATE_OK="false"
migrate_project "$project_id"

# migrate_project sets MIGRATE_OK; checking $? here would be unreliable
# because the function's final echo always succeeds.
# Plain assignment (not ((VAR++))) avoids tripping set -e when the
# pre-increment value is 0.
if [ "$MIGRATE_OK" == "true" ]; then
SUCCESS_COUNT=$((SUCCESS_COUNT + 1))
else
FAIL_COUNT=$((FAIL_COUNT + 1))
fi
done

echo "=========================================="
echo "Migration Complete"
echo "=========================================="
echo "Total Projects: $TOTAL"
echo "Successful: $SUCCESS_COUNT"
echo "Failed: $FAIL_COUNT"
echo "=========================================="
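The ID-list handling deserves care in bash: `--project-ids` arrives comma-separated, while jq in `--all` mode emits one ID per line, and a plain `IFS=',' read` stops at the first newline. A standalone sketch of a split that accepts both forms (the helper name is hypothetical):

```shell
#!/bin/bash
# Hypothetical helper: normalize comma- or newline-separated project IDs
# into the PROJECTS array used by the upgrade loop.
split_project_ids() {
local raw="${1//,/$'\n'}"       # turn commas into newlines
mapfile -t PROJECTS <<< "$raw"  # one array element per line
}

split_project_ids "proj-a,proj-b,proj-c"
echo "${#PROJECTS[@]}"   # 3

split_project_ids $'proj-x\nproj-y'
echo "${PROJECTS[1]}"    # proj-y
```

`mapfile -t` is bash 4+; on macOS's stock bash 3.2 a `while read` loop would be needed instead.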

Testing Strategy​

1. Unit Tests​

Test Coverage:

  • ProjectPathResolver: Path resolution logic, caching, fallback
  • InputFileMetadataService: Metadata generation, persistence
  • FileStructureMigrationService: Migration logic, page association
  • DocumentClassificationService: Heuristic classification rules

Example Test (ProjectPathResolverTest.java):

@Test
public void testResolvePagePath_NewStructureExists_ReturnsNewPath() {
// Arrange
String projectId = "test-project";
int pageNumber = 1;
String expectedPath = "projects/test-project/files/file-123/pages/001/";

when(fileSystemHandler.exists(expectedPath)).thenReturn(true);

// Act
String actualPath = dualReadHandler.resolvePageFolderPath(projectId, pageNumber);

// Assert
assertEquals(expectedPath, actualPath);
verify(fileSystemHandler).exists(expectedPath);
}

@Test
public void testResolvePagePath_LegacyFallback_ReturnsLegacyPath() {
// Arrange
String projectId = "legacy-project";
int pageNumber = 1;
String newPath = "projects/legacy-project/files/.../pages/001/";
String legacyPath = "projects/legacy-project/pages/001/";

when(fileSystemHandler.exists(newPath)).thenReturn(false);
when(fileSystemHandler.exists(legacyPath)).thenReturn(true);

// Act
String actualPath = dualReadHandler.resolvePageFolderPath(projectId, pageNumber);

// Assert
assertEquals(legacyPath, actualPath);
}

2. Integration Tests​

Test Scenarios:

  1. Legacy Project Read: Verify existing functionality unchanged
  2. Modern Project Read: Verify new structure works
  3. Migration End-to-End: Migrate project, verify pages accessible
  4. Rollback Safety: Ensure legacy structure preserved

Example Test (FileStructureMigrationIntegrationTest.java):

@Test
public void testMigrateProject_LegacyToModern_Success() {
// Arrange: Create legacy project
String projectId = createLegacyTestProject();

// Act: Migrate project
MigrationResult result = migrationService.migrateProject(
projectId, true /* preserve legacy */, false /* not dry run */);

// Assert: Migration successful
assertTrue(result.success);
assertEquals(3, result.totalPagesMigrated);
assertTrue(result.migratedFiles.size() > 0);

// Assert: Pages readable from new structure
for (int i = 1; i <= 3; i++) {
String path = dualReadHandler.resolvePageFolderPath(projectId, i);
assertTrue(path.contains("/files/"));
}

// Assert: Legacy structure still exists
String legacyPath = "projects/" + projectId + "/pages/";
assertTrue(fileSystemHandler.exists(legacyPath));
}

3. Backward Compatibility Tests​

Critical Tests:

  • Legacy project reads work unchanged
  • All existing API calls return correct data
  • Performance not degraded for legacy projects
  • Legacy plan-metadata.json still updated

Test Matrix:

| Project Type | Read Pages | Write Pages | List Files | Get Metadata |
|--------------|------------|-------------|------------|--------------|
| Legacy       | βœ… Pass    | βœ… Pass     | βœ… Pass    | βœ… Pass      |
| Transitional | βœ… Pass    | βœ… Pass     | βœ… Pass    | βœ… Pass      |
| Modern       | βœ… Pass    | βœ… Pass     | βœ… Pass    | βœ… Pass      |

Refactoring: Path Resolution Consolidation​

Problem Statement​

Currently, ArchitecturalPlanReviewer contains 15+ path-related methods that:

  1. Assume the legacy flat structure (pages/{pageNum}/)
  2. Mix concerns (domain logic + path utilities)
  3. Have duplicate static/instance method pairs
  4. Lack support for the new hierarchical structure (files/{fileId}/pages/)
  5. Leave the class responsibility unclear: is it project-level or file-level?

Current Path Methods in ArchitecturalPlanReviewer​

// Project-level paths
public static String getDefaultProjectHomeDir(String planId)
public String getProjectHomeDir(String planId)
public String getProjectHomeDir()
public static String getDefaultProjectsRootFolder()
public String getProjectsRootFolder()

// Legacy page paths (flat structure)
public static String getProjectPagesBasePath(String planId)
public String getPageFolderPath(int pageNumber)
public static String getPageFolderPath(String planId, int pageNumber)

// Metadata file paths
public String getPlanPageMetadataFilePath(int pageNumber)
public static String getPlanPageMetadataFilePath(String planId, int pageNumber)
public String getArchitecturalPlanMetadataFilePath()
public static String getArchitecturalPlanMetadataFilePath(String planId)

// Other paths
private String getProjectOverviewPath()
private String getFullProjectContentPath()
public String getProjectSourcePdfPath()

Issues:

  • ❌ No file_id support
  • ❌ Hardcoded legacy structure
  • ❌ Scattered across a business-logic class
  • ❌ Static methods have no access to a ProjectPathResolver instance

Semantic Clarity: What is ArchitecturalPlanReviewer?​

Current Reality:

  • Name suggests "single plan" (one file)
  • Implementation is project-scoped (has planId, loads all pages in project)
  • Historically: 1 plan = 1 project (no ambiguity)
  • Future: 1 project = N files (plans, electricals, mechanicals, inspector feedback)

Design Decision: Keep ArchitecturalPlanReviewer as project-scoped with optional file filtering.

Rationale:

  1. Minimal Breaking Changes: Existing code expects project-level operations
  2. Backward Compatible: Can operate on entire project (legacy) or single file (modern)
  3. Incremental Evolution: Can split into file/project classes later if needed
  4. Naming: "Plan" historically meant "project" in our domain

Proposed Architecture​

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Service Layer β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ ArchitecturalPlanServiceImpl β”‚ β”‚
β”‚ β”‚ (gRPC service, orchestrates reviewers) β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Business Logic Layer β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ ArchitecturalPlanReviewer β”‚ β”‚
β”‚ β”‚ - Operates at PROJECT level by default β”‚ β”‚
β”‚ β”‚ - Optional fileId filter for single-file mode β”‚ β”‚
β”‚ β”‚ - Delegates ALL path logic to ProjectPathResolver β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Utility/Helper Layer β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ ProjectPathResolver β”‚ β”‚
β”‚ β”‚ - ALL path construction logic β”‚ β”‚
β”‚ β”‚ - Supports modern & legacy structures β”‚ β”‚
β”‚ β”‚ - Dual-read with file_id optimization β”‚ β”‚
β”‚ β”‚ - Caching for performance β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Infrastructure Layer β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ FileSystemHandler (GCS/Local) β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Refactoring Plan: 3 Phases​

Phase 1: Extract Path Logic to ProjectPathResolver βœ…β€‹

Goal: Centralize ALL path construction logic in ProjectPathResolver

New Methods in ProjectPathResolver:

public class ProjectPathResolver {

private final FileSystemHandler fileSystemHandler;
private final String projectsRootFolder; // "projects" by default

// Constructor
public ProjectPathResolver(FileSystemHandler fileSystemHandler) {
this(fileSystemHandler, "projects");
}

public ProjectPathResolver(FileSystemHandler fileSystemHandler, String projectsRootFolder) {
this.fileSystemHandler = fileSystemHandler;
this.projectsRootFolder = projectsRootFolder;
}

// ========================================
// Project-Level Paths
// ========================================

/**
* Returns the projects root folder path.
* @return "projects" by default
*/
public String getProjectsRootFolder() {
return projectsRootFolder;
}

/**
* Returns the project home directory path.
* @param projectId The project ID
* @return "projects/{projectId}"
*/
public String getProjectHomeDir(String projectId) {
return projectsRootFolder + "/" + projectId;
}

/**
* Returns the project inputs folder path.
* @param projectId The project ID
* @return "projects/{projectId}/inputs"
*/
public String getProjectInputsPath(String projectId) {
return getProjectHomeDir(projectId) + "/inputs";
}

/**
* Returns the plan metadata file path (plan-metadata.json).
* @param projectId The project ID
* @return "projects/{projectId}/plan-metadata.json"
*/
public String getPlanMetadataFilePath(String projectId) {
return getProjectHomeDir(projectId) + "/plan-metadata.json";
}

/**
* Returns the project metadata file path (project-metadata.json).
* @param projectId The project ID
* @return "projects/{projectId}/project-metadata.json"
*/
public String getProjectMetadataFilePath(String projectId) {
return getProjectHomeDir(projectId) + "/project-metadata.json";
}

/**
* Returns the project overview file path.
* @param projectId The project ID
* @return "projects/{projectId}/overview.md"
*/
public String getProjectOverviewPath(String projectId) {
return getProjectHomeDir(projectId) + "/overview.md";
}

/**
* Returns the full project content file path.
* @param projectId The project ID
* @return "projects/{projectId}/project-content.md"
*/
public String getFullProjectContentPath(String projectId) {
return getProjectHomeDir(projectId) + "/project-content.md";
}

// ========================================
// File-Level Paths (Modern Structure)
// ========================================

/**
* Returns the files folder path.
* @param projectId The project ID
* @return "projects/{projectId}/files"
*/
public String getFilesBasePath(String projectId) {
return getProjectHomeDir(projectId) + "/files";
}

/**
* Returns the file index path.
* @param projectId The project ID
* @return "projects/{projectId}/files/index.json"
*/
public String getFileIndexPath(String projectId) {
return getFilesBasePath(projectId) + "/index.json";
}

/**
* Returns the file folder path.
* @param projectId The project ID
* @param fileId The file ID (e.g., "1", "2", "3")
* @return "projects/{projectId}/files/{fileId}"
*/
public String getFileFolderPath(String projectId, String fileId) {
return getFilesBasePath(projectId) + "/" + fileId;
}

/**
* Returns the file metadata path.
* @param projectId The project ID
* @param fileId The file ID
* @return "projects/{projectId}/files/{fileId}/metadata.json"
*/
public String getFileMetadataPath(String projectId, String fileId) {
return getFileFolderPath(projectId, fileId) + "/metadata.json";
}

/**
* Returns the file pages folder path.
* @param projectId The project ID
* @param fileId The file ID
* @return "projects/{projectId}/files/{fileId}/pages"
*/
public String getFilePagesBasePath(String projectId, String fileId) {
return getFileFolderPath(projectId, fileId) + "/pages";
}

// ========================================
// Page-Level Paths (Dual-Read Support)
// ========================================

/**
* Returns the legacy pages folder path.
* @param projectId The project ID
* @return "projects/{projectId}/pages"
*/
public String getLegacyPagesBasePath(String projectId) {
return getProjectHomeDir(projectId) + "/pages";
}

/**
* Returns the page folder path with optional file_id.
*
* <p><b>Path Resolution Strategy:</b>
* <ul>
* <li>If fileId provided: Direct modern path (FAST)</li>
* <li>If fileId null: Dual-read logic (cache β†’ modern β†’ legacy)</li>
* </ul>
*
* @param projectId The project ID
* @param pageNumber The page number (1-based)
* @param fileId Optional file ID for direct access (null for auto-detect)
* @return Resolved page folder path
* @throws PageNotFoundException if page doesn't exist in either structure
*/
public String resolvePageFolderPath(String projectId, int pageNumber, String fileId)
throws PageNotFoundException {
// Implementation already covered earlier in this TDD
// ... (see lines 699-745)
}

/**
* Returns the page metadata file path.
* @param projectId The project ID
* @param pageNumber The page number
* @param fileId Optional file ID
* @return "projects/{projectId}/files/{fileId}/pages/{pageNum}/metadata.json"
* or "projects/{projectId}/pages/{pageNum}/metadata.json" (legacy)
*/
public String getPageMetadataPath(String projectId, int pageNumber, String fileId)
throws PageNotFoundException {
String pageFolderPath = resolvePageFolderPath(projectId, pageNumber, fileId);
return pageFolderPath + "/metadata.json";
}

/**
* Returns the page PDF file path.
* @param projectId The project ID
* @param pageNumber The page number
* @param fileId Optional file ID
* @return Path to page.pdf
*/
public String getPagePdfPath(String projectId, int pageNumber, String fileId)
throws PageNotFoundException {
String pageFolderPath = resolvePageFolderPath(projectId, pageNumber, fileId);
return pageFolderPath + "/page.pdf";
}

/**
* Returns the page markdown file path.
* @param projectId The project ID
* @param pageNumber The page number
* @param fileId Optional file ID
* @return Path to page.md
*/
public String getPageMarkdownPath(String projectId, int pageNumber, String fileId)
throws PageNotFoundException {
String pageFolderPath = resolvePageFolderPath(projectId, pageNumber, fileId);
return pageFolderPath + "/page.md";
}

// ========================================
// Utility Methods
// ========================================

/**
* Checks if project uses modern file structure.
* @param projectId The project ID
* @return true if files/ directory exists
*/
public boolean isModernStructure(String projectId) throws IOException {
return fileSystemHandler.exists(getFilesBasePath(projectId));
}

/**
* Checks if project uses legacy structure.
* @param projectId The project ID
* @return true if pages/ directory exists but files/ doesn't
*/
public boolean isLegacyStructure(String projectId) throws IOException {
String filesPath = getFilesBasePath(projectId);
String pagesPath = getLegacyPagesBasePath(projectId);
return fileSystemHandler.exists(pagesPath) && !fileSystemHandler.exists(filesPath);
}

/**
* Atomically increments and returns the next file ID for a project.
* Thread-safe for concurrent file uploads using optimistic locking (CAS).
*
* @param projectId The project ID
* @return The assigned file ID (guaranteed unique within project)
* @throws IOException if max retries exceeded or I/O error
*/
public int getAndIncrementFileId(String projectId) throws IOException {
String indexPath = getFileIndexPath(projectId);
int maxRetries = 10;

for (int attempt = 0; attempt < maxRetries; attempt++) {
try {
// Read current index with version (GCS generation number)
Long currentVersion = fileSystemHandler.getFileVersion(indexPath);

JSONObject index;
if (currentVersion == null) {
// File doesn't exist - initialize new index
index = new JSONObject();
index.put("next_file_id", 1);
index.put("files", new JSONArray());
} else {
// File exists - read and parse
String content = fileSystemHandler.readFile(indexPath);
index = new JSONObject(content);
}

// Get current ID and increment for next time
int assignedId = index.optInt("next_file_id", 1);
index.put("next_file_id", assignedId + 1);

// Atomic write: Only succeeds if version matches
try {
fileSystemHandler.writeFileAtomic(indexPath, index.toString(2), currentVersion);
return assignedId;
} catch (AtomicWriteConflictException e) {
// Another thread modified the index first - retry after a short linear backoff
Thread.sleep(50 + (attempt * 10));
continue;
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
throw new IOException("Interrupted while assigning file ID", e);
}
}

throw new IOException("Failed to assign file ID after " + maxRetries + " retries");
}
}

FileSystemHandler Atomic Operations​

New Abstract Methods for thread-safe file ID generation:

/**
* Atomically writes a file if the expected version/generation matches.
* Provides compare-and-set (CAS) semantics for concurrent-safe updates.
*
* @param path The file path
* @param content The content to write
* @param expectedVersion The expected version/generation (null for "must not exist")
* @return The new version/generation after write
* @throws AtomicWriteConflictException if version doesn't match
* @throws IOException if there's an I/O error
*/
public abstract long writeFileAtomic(String path, String content, Long expectedVersion)
throws IOException, AtomicWriteConflictException;

/**
* Gets the current version/generation of a file.
*
* @param path The file path
* @return The current version/generation, or null if file doesn't exist
* @throws IOException if there's an error accessing the file
*/
public abstract Long getFileVersion(String path) throws IOException;

GcsFileSystemHandler Implementation:

  • Uses native GCS generation numbers
  • BlobTargetOption.generationMatch(expectedVersion) for CAS
  • BlobTargetOption.doesNotExist() for new files
  • Returns HTTP 412 on conflicts β†’ AtomicWriteConflictException

LocalFileSystemHandler Implementation:

  • Uses last modified time as version (milliseconds)
  • Synchronized locks per file path (single-instance only)
  • Note: Doesn't scale horizontally (use GCS in production)

AtomicWriteConflictException:

  • Custom exception for CAS failures
  • Contains: path, expectedVersion, actualVersion
  • Signals retry needed in getAndIncrementFileId()
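A minimal local-disk sketch tying these three pieces together. This is illustrative only: class names beyond AtomicWriteConflictException and the two method signatures are assumptions, and mtime-based versioning is not robust to sub-millisecond successive writes:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// CAS failure carries enough context for the caller to decide to retry.
class AtomicWriteConflictException extends IOException {
final String path;
final Long expectedVersion;
final Long actualVersion;

AtomicWriteConflictException(String path, Long expectedVersion, Long actualVersion) {
super("Version conflict on " + path + ": expected " + expectedVersion
+ ", found " + actualVersion);
this.path = path;
this.expectedVersion = expectedVersion;
this.actualVersion = actualVersion;
}
}

// Local-disk sketch: last-modified time serves as the "version", with one
// monitor per path. As noted above, this only serializes writers in one JVM.
class LocalCasHandler {
private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();

public Long getFileVersion(String path) throws IOException {
Path p = Paths.get(path);
return Files.exists(p) ? Files.getLastModifiedTime(p).toMillis() : null;
}

public long writeFileAtomic(String path, String content, Long expectedVersion)
throws IOException, AtomicWriteConflictException {
Object lock = locks.computeIfAbsent(path, k -> new Object());
synchronized (lock) {
Long actual = getFileVersion(path);
// null expectedVersion means "file must not exist yet"
if (!Objects.equals(actual, expectedVersion)) {
throw new AtomicWriteConflictException(path, expectedVersion, actual);
}
Path p = Paths.get(path);
if (p.getParent() != null) {
Files.createDirectories(p.getParent());
}
Files.write(p, content.getBytes(StandardCharsets.UTF_8));
return getFileVersion(path);
}
}
}
```

The GCS variant replaces the lock and mtime check with `BlobTargetOption.generationMatch(...)`, pushing the compare-and-set down to the storage service so it works across instances.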

Phase 2: Update ArchitecturalPlanReviewer​

Goal: Delegate all path logic to ProjectPathResolver, add optional fileId support

Changes to ArchitecturalPlanReviewer:

public class ArchitecturalPlanReviewer {

private final String planId; // Actually projectId
private final String projectsRootFolder;
private final FileSystemHandler fileSystemHandler;

// NEW: ProjectPathResolver instance
private final ProjectPathResolver pathResolver;

// NEW: Optional file ID for single-file mode
private final String fileId; // null for project-wide mode

// Constructor with optional fileId
public ArchitecturalPlanReviewer(
String planId,
FileSystemHandler fileSystemHandler,
String projectSourcePdfPath,
ModelClient modelClient,
List<Integer> pageList,
boolean forceReprocess,
boolean enableOrientationDetection,
List<String> iccDocumentIds,
ProgressCallback progressCallback,
String projectsRootFolder,
String fileId) throws IOException { // NEW parameter

this.planId = planId;
this.projectsRootFolder = projectsRootFolder;
this.fileSystemHandler = fileSystemHandler;
this.fileId = fileId; // NEW field

// NEW: Initialize ProjectPathResolver
this.pathResolver = new ProjectPathResolver(fileSystemHandler, projectsRootFolder);

// ... rest of initialization
}

// ========================================
// Updated Path Methods (delegate to ProjectPathResolver)
// ========================================

public String getProjectHomeDir() {
return pathResolver.getProjectHomeDir(planId);
}

public static String getDefaultProjectHomeDir(String planId) {
// Backward compatibility: use default root folder
ProjectPathResolver resolver = new ProjectPathResolver(
FileSystemHandlerFactory.createDefaultFileSystemHandler());
return resolver.getProjectHomeDir(planId);
}

public String getPageFolderPath(int pageNumber) throws PageNotFoundException {
return pathResolver.resolvePageFolderPath(planId, pageNumber, fileId);
}

public static String getPageFolderPath(String planId, int pageNumber) {
ProjectPathResolver resolver = new ProjectPathResolver(
FileSystemHandlerFactory.createDefaultFileSystemHandler());
try {
return resolver.resolvePageFolderPath(planId, pageNumber, null);
} catch (PageNotFoundException e) {
throw new RuntimeException(e);
}
}

public String getPlanPageMetadataFilePath(int pageNumber) throws PageNotFoundException {
return pathResolver.getPageMetadataPath(planId, pageNumber, fileId);
}

private String getArchitecturalPlanMetadataFilePath() {
return pathResolver.getPlanMetadataFilePath(planId);
}

private String getProjectOverviewPath() {
return pathResolver.getProjectOverviewPath(planId);
}

private String getFullProjectContentPath() {
return pathResolver.getFullProjectContentPath(planId);
}

// ... all other path methods delegate to pathResolver

// ========================================
// Optional: Convenience Getters
// ========================================

public String getFileId() {
return fileId;
}

public boolean isSingleFileMode() {
return fileId != null && !fileId.isEmpty();
}

public ProjectPathResolver getPathResolver() {
return pathResolver;
}
}

Backward Compatibility:

  • All existing constructors remain unchanged
  • New fileId parameter added only to the most flexible constructor
  • Default behavior (fileId=null) maintains current project-wide operation
  • Static methods continue to work via temporary ProjectPathResolver instances

Phase 3: Service Layer Updates​

Goal: Update gRPC service implementations to extract and pass file_id to ArchitecturalPlanReviewer

Key Services to Update:

  • ArchitecturalPlanServiceImpl (main facade)
  • ArchitecturalPlanReviewServiceImpl (compliance analysis)
  • ArchitecturalPlanAnalysisServiceImpl (analysis availability)
  • ComplianceReportAsyncServiceImpl (async tasks)

Example: ArchitecturalPlanServiceImpl:

public class ArchitecturalPlanServiceImpl {

@Override
public PageApplicabilityAnalysisList getApplicableCodeSections(
GetApplicableCodeSectionsRequest request) {

String projectId = request.getArchitecturalPlanId();
int pageNumber = request.getPageNumber();
String fileId = request.getFileId(); // From updated proto

// Create reviewer with optional fileId
// If fileId provided, reviewer operates in single-file mode
// If fileId null, reviewer operates in project-wide mode (legacy)
ArchitecturalPlanReviewer reviewer = createReviewer(projectId, fileId);

// Reviewer automatically uses fileId for path resolution
// ...
}

private ArchitecturalPlanReviewer createReviewer(String projectId, String fileId) {
// Use constructor with fileId parameter
// ...
}
}

Phase 4: Frontend/UI Updates​

Goal: Update Angular frontend to track file_id and pass it in RPC requests

Key Components to Update:

  • API Service (api.service.ts) - Add file_id parameter to RPC methods
  • Compliance Component - Look up file_id from InputFileMetadata
  • Page Navigation Component - Track file-to-page mappings
  • File Metadata Service - Fetch and cache InputFileMetadata list

Implementation Pattern:

// 1. Fetch InputFileMetadata for the project
this.fileMetadataService.listInputFiles(projectId).subscribe(files => {
this.inputFiles = files;
});

// 2. Look up file_id for a given page number
private getFileIdForPage(pageNumber: number): string | undefined {
const fileMetadata = this.inputFiles.find(
f => f.extracted_pages.includes(String(pageNumber))
);
return fileMetadata?.file_id;
}

// 3. Pass file_id when making RPC calls
loadPageAnalysis(pageNumber: number) {
const fileId = this.getFileIdForPage(pageNumber);

this.apiService.getApplicableCodeSections(
this.projectId,
pageNumber,
this.iccBookId,
fileId // Pass file_id (undefined for legacy projects)
).subscribe(/* ... */);
}
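The lookup in step 2 can be expressed as a standalone, testable function. The interface below is an illustrative shape, not the generated proto type; field names follow the snippet above:

```typescript
// Illustrative shape of the metadata the lookup needs.
interface InputFileMetadataLite {
  file_id: string;
  extracted_pages: string[]; // page numbers serialized as strings, e.g. ["1", "2"]
}

// Returns the file_id owning a page, or undefined when no file claims it
// (legacy projects have no InputFileMetadata, so every lookup misses).
function getFileIdForPage(
  files: InputFileMetadataLite[],
  pageNumber: number
): string | undefined {
  return files.find(f => f.extracted_pages.includes(String(pageNumber)))?.file_id;
}
```

Returning `undefined` for legacy projects is deliberate: the RPC layer omits `file_id`, and the backend's dual-read path resolution handles the rest.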

Migration Path (Refactoring Phases)

Phase 1: Backend Infrastructure (Week 1-2) ✅ COMPLETE

  • ✅ Implement ProjectPathResolver with all path methods
  • ✅ Add optional file_id to RPC proto definitions
  • ✅ Write comprehensive unit tests (40 tests)
  • ✅ Implement atomic getAndIncrementFileId() with GCS CAS
  • ✅ Add FileSystemHandler atomic operations

Phase 2: ArchitecturalPlanReviewer Refactoring (Week 3) ✅ COMPLETE

  • ✅ Update ArchitecturalPlanReviewer to use ProjectPathResolver
  • ✅ Add optional fileId parameter to constructor
  • ✅ Delegate all 8 path methods to pathResolver
  • ✅ Add static path builders (no FileSystemHandler overhead)
  • ✅ Maintain backward compatibility (all existing tests pass)

Phase 3: Service Layer Updates (Week 3-4) 🔜 NEXT

  • Update ArchitecturalPlanServiceImpl to pass fileId
  • Update ArchitecturalPlanReviewServiceImpl to pass fileId
  • Update ArchitecturalPlanAnalysisServiceImpl to pass fileId
  • Update ComplianceReportAsyncServiceImpl to pass fileId
  • Extract fileId from RPC requests and pass to reviewer constructor
  • Integration testing

Phase 4: Frontend/UI Updates (Week 4-5)

  • Update api.service.ts - Add file_id parameter to RPC methods
  • Update compliance component - Look up file_id from InputFileMetadata
  • Create file metadata service - Fetch and cache InputFileMetadata list
  • Update page navigation - Display hierarchical file tree
  • UI testing and polish

Benefits

  1. Single Source of Truth: All path logic in one place
  2. DRY Principle: No duplication between static/instance methods
  3. Testability: Easy to mock ProjectPathResolver
  4. Flexibility: Supports modern, legacy, and hybrid structures
  5. Performance: Caching and optimization in one place
  6. Backward Compatible: Existing code continues to work
  7. Future-Proof: Easy to add new path types (e.g., reports/{fileId}/)

Testing Strategy

Unit Tests for ProjectPathResolver:

@Test
public void testResolvePagePath_WithFileId_DirectPath() {
  ProjectPathResolver resolver = new ProjectPathResolver(mockFileSystemHandler);
  String path = resolver.resolvePageFolderPath("project-1", 5, "2");
  assertEquals("projects/project-1/files/2/pages/005", path);
  // Should NOT call fileSystemHandler (no filesystem checks)
}

@Test
public void testResolvePagePath_WithoutFileId_ModernStructure() throws Exception {
  ProjectPathResolver resolver = new ProjectPathResolver(mockFileSystemHandler);
  when(mockFileSystemHandler.exists("projects/project-1/files/")).thenReturn(true);
  when(mockFileSystemHandler.listDirectories("projects/project-1/files/"))
      .thenReturn(Arrays.asList("1", "2", "3"));
  when(mockFileSystemHandler.exists("projects/project-1/files/2/pages/005/"))
      .thenReturn(true);

  String path = resolver.resolvePageFolderPath("project-1", 5, null);
  assertEquals("projects/project-1/files/2/pages/005", path);
}

@Test
public void testResolvePagePath_WithoutFileId_LegacyFallback() throws Exception {
  ProjectPathResolver resolver = new ProjectPathResolver(mockFileSystemHandler);
  when(mockFileSystemHandler.exists("projects/project-1/files/")).thenReturn(false);
  when(mockFileSystemHandler.exists("projects/project-1/pages/005/")).thenReturn(true);

  String path = resolver.resolvePageFolderPath("project-1", 5, null);
  assertEquals("projects/project-1/pages/005", path);
}
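The dual-read behavior these tests pin down can be sketched as follows. This is a simplified illustration rather than the production class: FileSystemHandler is reduced to the two methods exercised above, and caching and error handling are omitted.

```java
import java.util.Arrays;
import java.util.List;

// Minimal stand-in for the real FileSystemHandler (two methods only).
interface FsHandler {
    boolean exists(String path);
    List<String> listDirectories(String path);
}

// Sketch of the resolution order: explicit file_id -> modern scan -> legacy fallback.
class PathResolverSketch {
    private final FsHandler fs;

    PathResolverSketch(FsHandler fs) { this.fs = fs; }

    String resolvePageFolderPath(String projectId, int page, String fileId) {
        String pageFolder = String.format("%03d", page);  // 5 -> "005"
        if (fileId != null) {
            // Caller knows the file: build the path directly, no filesystem checks.
            return "projects/" + projectId + "/files/" + fileId + "/pages/" + pageFolder;
        }
        String filesRoot = "projects/" + projectId + "/files/";
        if (fs.exists(filesRoot)) {
            // Modern structure: scan file directories for the one containing this page.
            for (String id : fs.listDirectories(filesRoot)) {
                String candidate = filesRoot + id + "/pages/" + pageFolder;
                if (fs.exists(candidate + "/")) {
                    return candidate;
                }
            }
        }
        // Legacy fallback: flat pages/ directory.
        return "projects/" + projectId + "/pages/" + pageFolder;
    }
}
```

Note that the fileId branch never touches the filesystem, which is exactly what the first test asserts.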

Open Questions for Discussion

  1. Naming: Should we rename planId → projectId throughout the codebase for clarity?

    • Recommendation: Yes, but as a separate refactoring (Issue #XXX)
  2. Static Methods: Keep static methods in ArchitecturalPlanReviewer for backward compatibility?

    • Recommendation: Yes, but mark them as @Deprecated after Phase 2
  3. File Index Performance: Should ProjectPathResolver cache files/index.json in memory?

    • Recommendation: Yes, with a TTL of 5 minutes (balances freshness against performance)
  4. Future Split: Should we eventually split ArchitecturalPlanReviewer into file/project classes?

    • Recommendation: Monitor usage patterns; split only if a clear need emerges

Deployment Strategy​

Phase 1: Backend Infrastructure (Week 1)

Deliverables:

  • ProjectPathResolver with fallback logic
  • InputFileMetadataService basic implementation
  • Unit tests passing
  • Feature flag: enable_dual_read_filesystem (default: true)

Deployment: Deploy to dev, run integration tests, promote to staging

Risk: Low (read-only, backward compatible)

Phase 2: Migration Service (Week 2)

Deliverables:

  • FileStructureMigrationService with dry-run support
  • CLI tool for bulk upgrades
  • Integration tests with real legacy projects
  • Feature flag: enable_file_structure_migration (default: false)

Deployment: Deploy to dev, test migration on cloned projects

Risk: Medium (write operations, but preserves legacy structure)

Phase 3: Frontend Integration (Week 3)

Deliverables:

  • FileMetadataListComponent showing file list (project settings)
  • PageTocHierarchicalComponent for hierarchical navigation (TOC sidebar)
  • LegacyUpgradeBannerComponent prompting users
  • User-initiated migration workflow
  • E2E tests in Cypress

Deployment: Deploy to dev, user acceptance testing

Risk: Low (UI only, backend already deployed)

Phase 4: Production Rollout (Week 4)

Deliverables:

  • Enable feature flags in production
  • Monitor error rates and performance
  • Gradual rollout: 10% → 50% → 100% of users
  • Rollback plan prepared

Deployment: Canary deployment, monitor metrics

Risk: Low (extensive testing, rollback available)

Monitoring and Observability

Key Metrics

  1. Read Performance:

    • page_read_latency_ms (p50, p95, p99)
    • path_cache_hit_rate (target: > 80%)
    • legacy_fallback_rate (should decrease over time)
  2. Migration Success:

    • migrations_total (count)
    • migrations_successful (count)
    • migrations_failed (count)
    • migration_duration_seconds (histogram)
  3. File Metadata:

    • files_with_metadata_percent (target: 100%)
    • classification_accuracy (manual validation)

Alerts

  1. Critical:

    • legacy_fallback_rate > 50% (indicates new structure not working)
    • page_read_latency_p99 > 2000ms (performance regression)
    • migrations_failed / migrations_total > 0.05 (5% failure rate)
  2. Warning:

    • path_cache_hit_rate < 60% (cache ineffective)
    • files_without_metadata > 10 (metadata generation failing)

Rollback Plan

Immediate Rollback (< 1 hour)

Scenario: Critical bug detected in production

Steps:

  1. Disable feature flag: enable_dual_read_filesystem = false
  2. Revert to previous deployment
  3. All reads go directly to legacy pages/ structure
  4. No data loss (legacy structure preserved)

Partial Rollback (Specific Projects)

Scenario: Migration failed for specific projects

Steps:

  1. Identify affected projects
  2. Delete files/ folder for those projects
  3. Pages automatically fall back to legacy pages/ structure
  4. No functionality lost

Data Recovery

Scenario: Accidental data loss (unlikely due to preservation)

Steps:

  1. Legacy pages/ folder is never deleted (configured via preserve_legacy_structure = true)
  2. Restore from Cloud Storage versioning if needed
  3. Re-run migration with fixed logic

Performance Considerations

Path Caching

  • In-memory cache with 1-hour TTL
  • Reduces filesystem checks by 80%+
  • Cache invalidation on migration

Lazy Metadata Loading

  • Metadata loaded on demand, not eagerly up front
  • List operations return minimal metadata
  • Full metadata fetched when needed

Parallel Migration

  • Multiple projects can be migrated concurrently
  • Pages within a project are migrated sequentially (safer)
  • Configurable concurrency limit

Security Considerations

RBAC Integration

  • Migration requires OWNER permissions
  • File metadata respects project-level permissions
  • Admin bulk upgrades logged for audit trail

Data Integrity

  • Checksums verified during migration
  • Transactional migrations (all-or-nothing where possible)
  • Legacy structure preserved for rollback

✅ Implementation Status (October 2025)

COMPLETED FEATURES

All core functionality has been successfully implemented and is working in production:

Backend Infrastructure ✅

  • InputFileMetadataService: Complete metadata generation and management
  • ProjectPathResolver: Intelligent dual-read with caching (modern → legacy fallback)
  • Atomic File Operations: GCS generation-based Compare-and-Set for race condition prevention
  • File-Aware gRPC API: Enhanced GetArchitecturalPlanPageRequest with file_id parameter
  • Thread-Safe Metadata Updates: Retry logic with exponential backoff for concurrent operations
  • Comprehensive Logging: Detailed debugging information throughout the system

Frontend Integration ✅

  • Hierarchical Table of Contents: Expandable file containers with nested pages
  • File-Aware Navigation: URLs include file ID (/files/{file_id}/pages/{page_number}/{tab})
  • Enhanced File Headers: Two-line layout with document type, visual emphasis, and proper spacing
  • File-Aware Page Selection: Correct highlighting and content loading per file
  • Page Overlap Detection: Scoped to individual files (not project-wide)
  • Automatic UI Refresh: Updates after background ingestion task completion
  • Intelligent Caching: Prevents unnecessary data reloads and race conditions

Multi-File Support ✅

  • File-Aware PDF Loading: Backend correctly serves PDFs from specific files
  • Concurrent Ingestion: Multiple files can be processed simultaneously without conflicts
  • File-Specific Operations: Page ingestion, overlap detection, and metadata updates per file
  • Backward Compatibility: Legacy single-file projects continue to work seamlessly

KEY ARCHITECTURAL DECISIONS MADE

  1. File ID Strategy: Auto-incrementing integers (1, 2, 3...) for readable URLs
  2. Path Resolution: Modern structure first, legacy fallback with caching
  3. Metadata Updates: Atomic operations using GCS object generations
  4. UI Pattern: Angular Material expansion panels for hierarchical navigation
  5. URL Structure: File-aware routes with backward compatibility redirects
  6. Caching Strategy: Path-based caching with file-aware cache keys

PRODUCTION DEPLOYMENT STATUS

  • ✅ Backend Services: Deployed and operational
  • ✅ Frontend UI: Hierarchical navigation working
  • ✅ gRPC API: File-aware endpoints functional
  • ✅ Database Schema: Metadata structure implemented
  • ✅ Migration Support: Dual-read compatibility active

Future Enhancements

  1. AI Document Classification: Use LLM to classify document types with higher accuracy
  2. Content-Based Summarization: Generate AI summaries of file contents
  3. Automatic Metadata Refresh: Periodically update metadata for stale files
  4. Advanced Search: Full-text search across file metadata and content
  5. File Versioning: Track changes to input files over time
  6. Multi-File Coordination: Batch upload with relationship tracking
  7. Custom Metadata Fields: User-defined tags and labels
  8. Analytics Dashboard: Visualize file types, processing times, storage usage

File Index Structure

Purpose

The files/index.json file serves a single purpose:

  • File ID Generation: Maintains auto-increment counter for new files

Schema

Location: projects/{projectId}/files/index.json

{
  "next_file_id": 4,
  "files": [
    {
      "file_id": "1",
      "file_name": "architectural-plans.pdf"
    },
    {
      "file_id": "2",
      "file_name": "electrical-plans.pdf"
    },
    {
      "file_id": "3",
      "file_name": "structural-plans.pdf"
    }
  ]
}
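Allocating a new file ID against this index uses the compare-and-set loop in getAndIncrementFileId(). The sketch below models that loop under a simplifying assumption: GCS object generations are replaced by an in-memory versioned cell, so it illustrates the retry shape, not the actual storage calls.

```java
import java.util.concurrent.atomic.AtomicReference;

// Simplified model of getAndIncrementFileId(): read the index plus its
// generation, write back with a precondition on that generation, retry on
// conflict. The AtomicReference stands in for a GCS object with generations.
class FileIdAllocator {
    static class Versioned {
        final int nextFileId;
        final long generation;
        Versioned(int nextFileId, long generation) {
            this.nextFileId = nextFileId;
            this.generation = generation;
        }
    }

    private final AtomicReference<Versioned> index =
        new AtomicReference<>(new Versioned(1, 0));

    int getAndIncrementFileId() {
        while (true) {
            Versioned current = index.get();  // read index.json + its generation
            Versioned updated = new Versioned(current.nextFileId + 1, current.generation + 1);
            // The write succeeds only if the generation is unchanged.
            if (index.compareAndSet(current, updated)) {
                return current.nextFileId;
            }
            // Another writer won the race: re-read and retry.
        }
    }
}
```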

Why No page_to_file_map?

We initially considered mapping page numbers to file IDs, but that approach is fundamentally flawed for multi-file projects:

  • ❌ Ambiguous: page "1" exists in multiple files (architectural, electrical, structural)
  • ❌ Not scalable: a map keyed by page number cannot represent file-scoped numbering
  • ✅ Solution: the frontend/API must always pass both file_id AND page_number

Page Number Semantics:

  • Modern projects: Page numbers are file-scoped (each file has pages 1, 2, 3...)
  • Legacy projects: Page numbers are project-global (single file, sequential)
  • Migration: Legacy global pages → modern file-scoped pages (e.g., page 46 → file 2, page 1)

Benefits:

  • ✅ Simple, unambiguous schema
  • ✅ Single source of truth for next file ID
  • ✅ Small file size (few KB even with hundreds of files)
  • ✅ No page number collisions

Update Strategy:

  • Updated when new files are uploaded
  • Updated when files are deleted
  • Read-only for page lookups (frontend tracks file_id separately)

Open Questions

  1. Q: How do we track which pages belong to which file?
    A: Use InputFileMetadata.extracted_pages field (stored in files/{file_id}/metadata.json)

  2. Q: What if users upload duplicate files?
    A: Detect duplicates using MD5 checksum, prompt user to replace or keep both

  3. Q: How to handle pages that don't belong to any input file?
    A: Create a "miscellaneous" file entry with ID unknown-source

  4. Q: Should migration be reversible (downgrade from modern to legacy)?
    A: Not initially - legacy structure is preserved, so just delete files/ to "downgrade"
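For Q2, the duplicate check can be as simple as comparing MD5 hex digests of the uploaded bytes. Below is a sketch using the JDK's MessageDigest; the surrounding lookup against the project's existing files is omitted, and the class name is illustrative.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch of the duplicate-detection checksum: hash the uploaded file's bytes
// and compare against checksums stored for files already in the project.
class DuplicateCheck {
    static String md5Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));  // lowercase hex, zero-padded
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 unavailable", e);  // not expected on standard JREs
        }
    }
}
```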

Success Criteria

Phase 1 (Infrastructure):

  • ✅ All legacy projects continue to work unchanged
  • ✅ No performance regression (< 5% latency increase)
  • ✅ 100% backward compatibility test coverage

Phase 2 (Migration):

  • ✅ > 95% migration success rate
  • ✅ Zero data loss incidents
  • ✅ Legacy structure preserved in all cases

Phase 3 (Frontend):

  • ✅ File metadata visible in UI (project settings page)
  • ✅ Hierarchical page navigation implemented (TOC sidebar)
  • ✅ User-initiated upgrades working
  • ✅ Positive user feedback on new features

Phase 4 (Adoption):

  • ✅ > 50% of active projects upgraded within 3 months
  • ✅ File metadata used in search/filter features
  • ✅ Reduced support tickets about file organization

References