Query Task Timing Utility
Overview
The query-task-timing.sh script analyzes task timing data from Firestore to help optimize progress percentage calculations and identify performance bottlenecks in background tasks.
Script Location: cli/sdlc/utils/query-task-timing.sh
Purpose
This utility provides detailed timing analysis for background tasks (plan ingestion, code applicability, compliance report) by:
- ✅ Analyzing step durations - Shows how long each step takes
- ✅ Calculating proportions - Determines actual time distribution vs. assumed percentages
- ✅ Identifying bottlenecks - Highlights slow steps that need optimization
- ✅ Optimizing progress bars - Provides data to calibrate progress percentage calculations
- ✅ Performance tracking - Compare timing across different task runs
Usage
Basic Syntax
./cli/sdlc/utils/query-task-timing.sh <taskId> [project]
Parameters:
taskId- Task ID to analyze (UUID format)project- (Optional) GCP project ID (default:construction-code-expert-test)
Examples
Analyze Plan Ingestion Task
# Analyze plan ingestion task
./cli/sdlc/utils/query-task-timing.sh a7326bb7-9d32-442c-85bc-0807541f893d
# Analyze from dev environment
./cli/sdlc/utils/query-task-timing.sh a7326bb7-9d32-442c-85bc-0807541f893d construction-code-expert-dev
Output:
⏱️ Analyzing task timing data
📁 Project: construction-code-expert-test
🆔 Task ID: a7326bb7-9d32-442c-85bc-0807541f893d
📥 Fetching task data from Firestore...
✅ Task data fetched successfully!
📈 Analyzing Progress Steps...
📈 Progress Steps Analysis:
Step | Title | Duration (ms) | Duration (s) | Status
-----|-----------------------------------------------|---------------|-------------|--------
1 | Initializing plan ingestion | 1234 | 1.2 | COMPLETED
2 | Extracting pages from PDF | 45678 | 45.7 | COMPLETED
3 | Converting pages to markdown | 123456 | 123.5 | COMPLETED
4 | Analyzing building codes | 67890 | 67.9 | COMPLETED
5 | Generating compliance report | 34567 | 34.6 | COMPLETED
🎯 Duration Analysis:
Total Duration: 272825ms (272s)
📊 Proportional Progress Percentages:
Step | Current % | Actual Duration | Proportional % | Cumulative %
-----|-----------|-----------------|----------------|-------------
1 | 7% | 1234ms | 0% | 0%
2 | 15% | 45678ms | 16% | 16%
3 | 23% | 123456ms | 45% | 61%
4 | 30% | 67890ms | 24% | 85%
5 | 38% | 34567ms | 12% | 97%
✅ Analysis complete!
Analyze Code Applicability Task
./cli/sdlc/utils/query-task-timing.sh <code-applicability-task-id>
Analyze Compliance Report Task
./cli/sdlc/utils/query-task-timing.sh <compliance-report-task-id>
Output Explanation
Progress Steps Analysis
Shows each step with:
- Step Number - Sequential step identifier
- Title - Human-readable step name (from proto annotations)
- Duration (ms) - Time taken in milliseconds
- Duration (s) - Time taken in seconds
- Status - Step completion status
Duration Analysis
- Total Duration - Sum of all step durations
- Helps identify which steps take the most time
Proportional Progress Percentages
Compares current progress calculation with actual timing data:
- Current % - Current progress percentage assigned to this step
- Actual Duration - Measured time for this step
- Proportional % - What percentage this step should get based on actual timing
- Cumulative % - Running total of progress
Use Case: If "Proportional %" differs significantly from "Current %", you should adjust the progress percentage annotations in your proto file.
Use Cases
1. Optimize Progress Bar Accuracy
When users report inaccurate progress bars:
# 1. Get task ID from logs or UI
TASK_ID="a7326bb7-9d32-442c-85bc-0807541f893d"
# 2. Analyze timing
./cli/sdlc/utils/query-task-timing.sh $TASK_ID
# 3. Compare "Current %" vs "Proportional %"
# If they differ significantly, update proto annotations
Example: If "Extract Pages" shows:
- Current %: 15%
- Proportional %: 45%
Then update task.proto:
EXTRACT_PAGES = 2 [
(step_title) = "Extracting pages from PDF",
(step_progress_percent) = 45 // Updated from 15 based on timing data
];
2. Identify Performance Bottlenecks
Find slow steps that need optimization:
# Analyze multiple task runs
for TASK_ID in $TASK_IDS; do
echo "Analyzing $TASK_ID..."
./cli/sdlc/utils/query-task-timing.sh $TASK_ID | grep "Duration Analysis" -A 5
done
# Look for steps that consistently take >60 seconds
Action Items:
- Steps >60s: Consider breaking into smaller steps
- Steps >120s: Investigate for optimization opportunities
- Steps with high variance: Check for external dependencies
3. Compare Task Performance
Compare timing across different environments or versions:
# Dev environment
./cli/sdlc/utils/query-task-timing.sh $TASK_ID construction-code-expert-dev > timing-dev.txt
# Test environment
./cli/sdlc/utils/query-task-timing.sh $TASK_ID construction-code-expert-test > timing-test.txt
# Compare
diff timing-dev.txt timing-test.txt
4. Track Performance Improvements
Before and after optimization:
# Before optimization
./cli/sdlc/utils/query-task-timing.sh $OLD_TASK_ID > timing-before.txt
# Deploy optimized code
# After optimization
./cli/sdlc/utils/query-task-timing.sh $NEW_TASK_ID > timing-after.txt
# Calculate improvement
echo "Before: $(grep "Total Duration" timing-before.txt)"
echo "After: $(grep "Total Duration" timing-after.txt)"
5. Debug Stuck Tasks
When a task appears stuck at a specific percentage:
# Analyze the stuck task
./cli/sdlc/utils/query-task-timing.sh $STUCK_TASK_ID
# Check:
# - Which step is it stuck on?
# - Is that step taking unusually long?
# - Are there any error messages?
# Cross-reference with logs
gcloud logging read "jsonPayload.taskId=$STUCK_TASK_ID" --limit 100
Integration with Other Tools
With fetch-firestore-object.sh
This script internally uses fetch-firestore-object.sh to retrieve task data:
# query-task-timing.sh calls:
./fetch-firestore-object.sh tasks $TASK_ID $PROJECT_ID
# Then parses the progress_steps array
With jq for Custom Analysis
Extract specific timing data:
# Get just step durations
./cli/sdlc/utils/query-task-timing.sh $TASK_ID | \
grep -A 100 "Progress Steps Analysis" | \
awk -F'|' 'NR>2 {print $3}' | \
xargs
# Calculate average step duration
./cli/sdlc/utils/query-task-timing.sh $TASK_ID | \
grep -A 100 "Progress Steps Analysis" | \
awk -F'|' 'NR>2 {sum+=$3; count++} END {print "Average:", sum/count "ms"}'
In Performance Monitoring Scripts
#!/bin/bash
# monitor-task-performance.sh
TASK_IDS=$(gcloud firestore documents list tasks --filter="status=COMPLETED" --limit 10 --format="value(name)" | cut -d'/' -f6)
echo "Task ID,Total Duration (ms),Slowest Step,Slowest Duration (ms)"
for TASK_ID in $TASK_IDS; do
OUTPUT=$(./cli/sdlc/utils/query-task-timing.sh $TASK_ID 2>&1)
TOTAL=$(echo "$OUTPUT" | grep "Total Duration:" | awk '{print $3}' | sed 's/ms//')
SLOWEST_STEP=$(echo "$OUTPUT" | grep -A 100 "Progress Steps Analysis" | sort -t'|' -k3 -nr | head -2 | tail -1 | awk -F'|' '{print $2}' | xargs)
SLOWEST_DURATION=$(echo "$OUTPUT" | grep -A 100 "Progress Steps Analysis" | sort -t'|' -k3 -nr | head -2 | tail -1 | awk -F'|' '{print $3}' | xargs)
echo "$TASK_ID,$TOTAL,$SLOWEST_STEP,$SLOWEST_DURATION"
done
Understanding Progress Steps
Firestore Data Structure
Progress steps are stored as an array of maps in Firestore:
{
"progress_steps": {
"arrayValue": {
"values": [
{
"mapValue": {
"fields": {
"stepNumber": {"doubleValue": 1},
"name": {"stringValue": "Initializing plan ingestion"},
"durationMs": {"stringValue": "1234"},
"status": {"stringValue": "COMPLETED"}
}
}
}
]
}
}
}
Step Status Values
PENDING- Step not yet startedIN_PROGRESS- Step currently executingCOMPLETED- Step finished successfullyFAILED- Step failed with errorSKIPPED- Step skipped (conditional logic)
Best Practices
1. Analyze Completed Tasks
For accurate timing data, analyze completed tasks:
# Good - completed task
./cli/sdlc/utils/query-task-timing.sh $COMPLETED_TASK_ID
# Less useful - in-progress task (incomplete timing data)
./cli/sdlc/utils/query-task-timing.sh $IN_PROGRESS_TASK_ID
2. Sample Multiple Tasks
Don't optimize based on a single task run:
# Analyze 5-10 tasks of the same type
for i in {1..10}; do
./cli/sdlc/utils/query-task-timing.sh $TASK_ID_$i >> timing-analysis.txt
done
# Calculate averages
grep "Total Duration" timing-analysis.txt | awk '{sum+=$3} END {print "Average:", sum/NR "ms"}'
3. Document Findings
Create a timing report:
# Generate report
cat > timing-report.md <<EOF
# Task Timing Analysis - $(date +%Y-%m-%d)
## Task Type: Plan Ingestion
### Sample Size: 10 tasks
### Findings:
$(./cli/sdlc/utils/query-task-timing.sh $TASK_ID)
### Recommendations:
- Step 3 (Convert to Markdown) takes 45% of total time
- Consider parallel processing for page conversion
- Current progress % should be adjusted from 23% to 45%
EOF
4. Update Proto Annotations
Based on timing analysis, update progress percentages in proto files:
// task.proto
enum PlanIngestionStep {
INITIALIZE = 1 [
(step_title) = "Initializing plan ingestion",
(step_progress_percent) = 1 // Takes <1% of time
];
EXTRACT_PAGES = 2 [
(step_title) = "Extracting pages from PDF",
(step_progress_percent) = 16 // Takes ~16% of time
];
CONVERT_TO_MARKDOWN = 3 [
(step_title) = "Converting pages to markdown",
(step_progress_percent) = 45 // Takes ~45% of time (UPDATED!)
];
// ... other steps
}
Troubleshooting
No progress_steps Found
❌ No progress_steps found in task data
💡 Available fields:
[
"taskId",
"taskType",
"status",
"progressPercent"
]
Possible Causes:
- Task is too old (before progress tracking was added)
- Task type doesn't support step-by-step progress
- Task failed before any steps were recorded
Solution: Use a more recent task or check if the task type supports progress steps.
Invalid Task ID
❌ Failed to fetch task data:
❌ Document not found or access denied
Solution: Verify the task ID is correct and exists in the specified project.
bc Command Not Found
bash: bc: command not found
Solution: Install bc (basic calculator):
# macOS
brew install bc
# Ubuntu/Debian
sudo apt-get install bc
Advanced Usage
Export to CSV
# Export timing data to CSV
./cli/sdlc/utils/query-task-timing.sh $TASK_ID | \
grep -A 100 "Progress Steps Analysis" | \
awk -F'|' 'NR>2 {print $1","$2","$3","$4","$5}' > timing-data.csv
Aggregate Multiple Tasks
#!/bin/bash
# aggregate-timing.sh - Analyze multiple tasks
TASK_IDS=(
"task-id-1"
"task-id-2"
"task-id-3"
)
echo "Step,Avg Duration (ms),Min Duration (ms),Max Duration (ms)"
for STEP in {1..13}; do
DURATIONS=()
for TASK_ID in "${TASK_IDS[@]}"; do
DURATION=$(./cli/sdlc/utils/query-task-timing.sh $TASK_ID | \
grep "^[[:space:]]*$STEP |" | \
awk -F'|' '{print $3}' | \
xargs)
if [[ -n "$DURATION" ]]; then
DURATIONS+=($DURATION)
fi
done
if [[ ${#DURATIONS[@]} -gt 0 ]]; then
AVG=$(echo "${DURATIONS[@]}" | awk '{sum=0; for(i=1;i<=NF;i++)sum+=$i; print sum/NF}')
MIN=$(echo "${DURATIONS[@]}" | tr ' ' '\n' | sort -n | head -1)
MAX=$(echo "${DURATIONS[@]}" | tr ' ' '\n' | sort -n | tail -1)
echo "$STEP,$AVG,$MIN,$MAX"
fi
done
Compare Task Types
# Compare plan ingestion vs compliance report
echo "Plan Ingestion:"
./cli/sdlc/utils/query-task-timing.sh $PLAN_INGESTION_TASK_ID | grep "Total Duration"
echo "Compliance Report:"
./cli/sdlc/utils/query-task-timing.sh $COMPLIANCE_REPORT_TASK_ID | grep "Total Duration"
Understanding the Analysis
Current % vs Proportional %
Current %: The progress percentage currently assigned in code (proto annotations)
Proportional %: What the percentage should be based on actual timing data
Example:
Step | Current % | Actual Duration | Proportional % | Cumulative %
-----|-----------|-----------------|----------------|-------------
3 | 23% | 123456ms | 45% | 61%
Interpretation:
- Step 3 is currently assigned 23% of progress
- But it actually takes 45% of total time
- This causes the progress bar to move slowly during this step
- Action: Update proto annotation from 23% to 45%
Cumulative Progress
Shows the running total of progress percentages:
Good Distribution (cumulative reaches ~100%):
Step 1: 10% cumulative
Step 2: 25% cumulative
Step 3: 70% cumulative
Step 4: 95% cumulative
Step 5: 100% cumulative
Poor Distribution (cumulative doesn't match reality):
Step 1: 5% cumulative (too low)
Step 2: 10% cumulative (too low)
Step 3: 55% cumulative (big jump - step 3 is slow)
Step 4: 60% cumulative (too low)
Step 5: 100% cumulative (big jump at end)
Performance Optimization Workflow
1. Collect Baseline Data
# Run timing analysis on current implementation
./cli/sdlc/utils/query-task-timing.sh $TASK_ID > baseline-timing.txt
# Identify slowest steps
grep -A 100 "Progress Steps Analysis" baseline-timing.txt | sort -t'|' -k3 -nr | head -5
2. Optimize Slow Steps
Focus on steps that take >30% of total time:
# If "Convert to Markdown" takes 45% of time:
# - Consider parallel processing
# - Optimize markdown conversion logic
# - Cache intermediate results
3. Measure Improvement
# After optimization
./cli/sdlc/utils/query-task-timing.sh $NEW_TASK_ID > optimized-timing.txt
# Compare
echo "Before:"
grep "Total Duration" baseline-timing.txt
echo "After:"
grep "Total Duration" optimized-timing.txt
# Calculate improvement percentage
4. Update Progress Percentages
Based on new timing data:
# Get proportional percentages
./cli/sdlc/utils/query-task-timing.sh $NEW_TASK_ID | \
grep -A 100 "Proportional Progress Percentages"
# Update proto file with new percentages
# Regenerate code: mvn clean protobuf:compile protobuf:compile-custom
Prerequisites
- gcloud CLI - Must be authenticated
- jq - JSON processor
- bc - Basic calculator (for percentage calculations)
- fetch-firestore-object.sh - Required dependency (same directory)
Installation
# Install prerequisites
brew install jq bc # macOS
sudo apt-get install jq bc # Ubuntu/Debian
# Authenticate
gcloud auth login
gcloud config set project construction-code-expert-test
See Also
- Fetch Firestore Object Utility - Fetch any Firestore document
- Background Tasks Architecture
- Developer Playbook
- Task Proto Definitions:
src/main/proto/task.proto